<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Swati Tyagi</title>
    <description>The latest articles on Forem by Swati Tyagi (@swat_24).</description>
    <link>https://forem.com/swat_24</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2736165%2F67d5937a-8cdd-488b-a357-fd17f024f227.png</url>
      <title>Forem: Swati Tyagi</title>
      <link>https://forem.com/swat_24</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/swat_24"/>
    <language>en</language>
    <item>
      <title>Deploying a Serverless AI Agent with AWS Bedrock, Lambda, and API Gateway</title>
      <dc:creator>Swati Tyagi</dc:creator>
      <pubDate>Sun, 28 Dec 2025 01:15:14 +0000</pubDate>
      <link>https://forem.com/swat_24/deploying-a-serverless-ai-agent-with-aws-bedrock-lambda-and-api-gateway-2123</link>
      <guid>https://forem.com/swat_24/deploying-a-serverless-ai-agent-with-aws-bedrock-lambda-and-api-gateway-2123</guid>
      <description>&lt;p&gt;This guide walks through building a question-answering service powered by GenAI using AWS bedrock. The architecture accepts prompts via HTTP and returns model-generated responses using Amazon Bedrock—all while keeping costs minimal through serverless infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ekdi55u9mhlrdumivdm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ekdi55u9mhlrdumivdm.png" alt="Architecture diagram" width="800" height="391"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Credit: &lt;a href="https://aws.amazon.com/blogs/architecture/building-an-ai-gateway-to-amazon-bedrock-with-amazon-api-gateway/" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/architecture/building-an-ai-gateway-to-amazon-bedrock-with-amazon-api-gateway/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Flow:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;External clients send HTTP requests to API Gateway&lt;/li&gt;
&lt;li&gt;API Gateway routes requests to the Lambda function&lt;/li&gt;
&lt;li&gt;Lambda invokes Amazon Bedrock's Nova Micro model&lt;/li&gt;
&lt;li&gt;ECR stores the Lambda container image (deployment artifact)&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Challenge
&lt;/h2&gt;

&lt;p&gt;When implementing generative AI services, choosing the right architecture matters. This implementation demonstrates a lightweight GenAI solution that can integrate with existing systems or be exposed externally through an API.&lt;/p&gt;

&lt;h2&gt;
  
  
  Requirements
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Functional Goals
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Prompt Processing&lt;/td&gt;
&lt;td&gt;Accept prompts and return Nova Micro completions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HTTP Endpoint&lt;/td&gt;
&lt;td&gt;Expose an endpoint for triggering responses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Estimated Volume&lt;/td&gt;
&lt;td&gt;~100 monthly requests (for cost estimation)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Operational Goals
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Automation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fully automated deployment via GitHub Actions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Availability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;99.9%+ monthly uptime&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Security&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;IAM-scoped Bedrock access, OpenID Connect auth, HTTPS-only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Observability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Structured logging with CloudWatch dashboards&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Intentional Omissions
&lt;/h3&gt;

&lt;p&gt;End-user authentication, authorization, and input sanitization are intentionally excluded to keep the focus on the core GenAI implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cost Analysis
&lt;/h2&gt;

&lt;p&gt;Based on an estimated &lt;strong&gt;22 input tokens&lt;/strong&gt; and &lt;strong&gt;232 output tokens&lt;/strong&gt; per request:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;Monthly Cost&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Bedrock (Nova Micro)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~$0.003&lt;/td&gt;
&lt;td&gt;2,200 input / 23,200 output tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Lambda&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;td&gt;Within free tier (1M requests, 400K GB-seconds)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;API Gateway&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Free (Year 1)&lt;/td&gt;
&lt;td&gt;~$0.0004/month after&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ECR&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~$0.01&lt;/td&gt;
&lt;td&gt;300MB image after 500MB free tier&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Scaling Projections
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Monthly Requests&lt;/th&gt;
&lt;th&gt;Estimated Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;~$0.04&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;~$0.39&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100,000&lt;/td&gt;
&lt;td&gt;~$3.76&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
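&lt;p&gt;The projections above come down to a few lines of arithmetic. Here is a quick sketch; the Nova Micro rates are assumptions based on published on-demand pricing ($0.035 per million input tokens, $0.14 per million output tokens) and should be checked against the current AWS pricing page:&lt;/p&gt;

```typescript
// Sketch: reproduce the Bedrock line of the cost tables.
// Assumed Nova Micro on-demand rates (USD per 1M tokens); verify
// against the current AWS pricing page before relying on them.
const INPUT_RATE_PER_M = 0.035;
const OUTPUT_RATE_PER_M = 0.14;

// Per-request token estimates from the article.
const INPUT_TOKENS = 22;
const OUTPUT_TOKENS = 232;

function bedrockMonthlyCost(requests: number): number {
  const inputCost = (requests * INPUT_TOKENS / 1_000_000) * INPUT_RATE_PER_M;
  const outputCost = (requests * OUTPUT_TOKENS / 1_000_000) * OUTPUT_RATE_PER_M;
  return inputCost + outputCost;
}

console.log(bedrockMonthlyCost(100).toFixed(4));    // 0.0033 -> the "~$0.003" row
console.log(bedrockMonthlyCost(10_000).toFixed(2));
```

&lt;p&gt;For 100 monthly requests this lands at roughly $0.0033, matching the ~$0.003 Bedrock row above; API Gateway, Lambda, and ECR account for the small remainder at higher volumes.&lt;/p&gt;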

&lt;h2&gt;
  
  
  Building the Agent
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Project Setup
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; handler terraform 
&lt;span class="nb"&gt;cd &lt;/span&gt;handler
pnpm init &lt;span class="nt"&gt;-y&lt;/span&gt;
pnpm &lt;span class="nt"&gt;--package&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;typescript dlx tsc &lt;span class="nt"&gt;--init&lt;/span&gt;
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; src __tests__
&lt;span class="nb"&gt;touch &lt;/span&gt;src/&lt;span class="o"&gt;{&lt;/span&gt;app,env,index&lt;span class="o"&gt;}&lt;/span&gt;.ts 

pnpm add &lt;span class="nt"&gt;-D&lt;/span&gt; @types/node tsx typescript
pnpm add ai @ai-sdk/amazon-bedrock zod dotenv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Core Components
&lt;/h3&gt;

&lt;p&gt;The implementation has three layers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;flowchart TB
    A["Lambda Handler&amp;lt;br/&amp;gt;&amp;lt;i&amp;gt;Parses events, returns responses&amp;lt;/i&amp;gt;"] --&amp;gt; B["Application Logic&amp;lt;br/&amp;gt;&amp;lt;i&amp;gt;Manages prompts &amp;amp; orchestration&amp;lt;/i&amp;gt;"]
    B --&amp;gt; C["Bedrock Integration&amp;lt;br/&amp;gt;&amp;lt;i&amp;gt;Model invocation via AI SDK&amp;lt;/i&amp;gt;"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Lambda Entry Point
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// index.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Welcome from Warike technologies&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;statusCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;statusCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="k"&gt;instanceof&lt;/span&gt; &lt;span class="nb"&gt;Error&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Unexpected error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
            &lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
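&lt;p&gt;The handler can be exercised locally before any infrastructure exists. A minimal harness with &lt;code&gt;main&lt;/code&gt; stubbed out (the real implementation calls Bedrock):&lt;/p&gt;

```typescript
// Minimal local harness for the handler shape above.
// `main` is stubbed here; the real one invokes Bedrock via the AI SDK.
const main = async (prompt: string) => `echo: ${prompt}`;

const handler = async (event: { body?: string }) => {
  try {
    const body = event.body ? JSON.parse(event.body) : {};
    const prompt = body.prompt ?? "Welcome from Warike technologies";
    const response = await main(prompt);
    return { statusCode: 200, body: JSON.stringify({ success: true, data: response }) };
  } catch (error) {
    return {
      statusCode: 500,
      body: JSON.stringify({
        success: false,
        error: error instanceof Error ? error.message : "Unexpected error",
      }),
    };
  }
};

// Simulate an API Gateway proxy event and print the result.
handler({ body: JSON.stringify({ prompt: "ping" }) }).then((res) => {
  console.log(res.statusCode, res.body);
});
```

&lt;p&gt;A malformed body exercises the 500 path, since &lt;code&gt;JSON.parse&lt;/code&gt; throws inside the &lt;code&gt;try&lt;/code&gt; block.&lt;/p&gt;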



&lt;h4&gt;
  
  
  Bedrock Integration
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// utils/bedrock.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;generateResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;regionId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;modelId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;({});&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bedrock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createAmazonBedrock&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;regionId&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;usage&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;generateText&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;bedrock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="na"&gt;system&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You are a helpful assistant.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;prompt&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`model: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, response: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, usage: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Environment Variables
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AWS_REGION=us-west-2
AWS_BEDROCK_MODEL='amazon.nova-micro-v1:0'
AWS_BEARER_TOKEN_BEDROCK='aws_bearer_token_bedrock'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Security Note:&lt;/strong&gt; Use short-lived API keys only.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Infrastructure
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Dockerfile
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# Build Stage&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;node:22-alpine&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;builder&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /usr/src/app&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;corepack &lt;span class="nb"&gt;enable&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; package.json pnpm-lock.yaml* ./&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;pnpm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--frozen-lockfile&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;pnpm run build

&lt;span class="c"&gt;# Runtime Stage&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; public.ecr.aws/lambda/nodejs:22&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; ${LAMBDA_TASK_ROOT}&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder /usr/src/app/dist/src ./ &lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder /usr/src/app/node_modules ./node_modules&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; [ "index.handler" ]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Terraform Resources
&lt;/h3&gt;

&lt;p&gt;Key infrastructure components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;API Gateway&lt;/strong&gt; — HTTP protocol with Lambda integration, CORS headers, JSON access logs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bedrock Permissions&lt;/strong&gt; — Nova Micro inference profile access&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lambda Function&lt;/strong&gt; — 900-second timeout, CloudWatch logging enabled&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;📝 &lt;strong&gt;Note:&lt;/strong&gt; The ECR seeding resource requires Docker running locally.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  CI/CD Pipeline
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;flowchart LR
    A[Push to Main] --&amp;gt; B[Build &amp;amp; Test]
    B --&amp;gt; C[Build Docker Image]
    C --&amp;gt; D[Push to ECR]
    D --&amp;gt; E[Deploy Lambda]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The GitHub Actions workflow handles building, testing, Docker image creation, ECR push, and Lambda deployment—triggered on pushes to main.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-sS&lt;/span&gt; &lt;span class="s2"&gt;"https://123456.execute-api.us-west-2.amazonaws.com/dev/"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"prompt":"Heeey hoe gaat het?"}'&lt;/span&gt; | jq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Expected Response:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"success"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Hoi! Het gaat prima, bedankt voor het vragen..."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Monitoring
&lt;/h2&gt;

&lt;p&gt;CloudWatch dashboards provide visibility into errors and performance metrics.&lt;/p&gt;




&lt;h2&gt;
  
  
  Cleanup
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;terraform destroy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;✅ Serverless GenAI with API Gateway, Lambda, and Bedrock's Nova Micro delivers a functional, cost-effective solution&lt;/p&gt;

&lt;p&gt;✅ Pricing remains negligible even at significant scale&lt;/p&gt;

&lt;p&gt;✅ Terraform handles infrastructure; GitHub Actions automates deployment&lt;/p&gt;

&lt;p&gt;✅ Foundation readily supports more sophisticated generative AI applications&lt;/p&gt;

</description>
      <category>aws</category>
      <category>lambda</category>
      <category>bedrock</category>
    </item>
    <item>
      <title>Comparing Cloud AI Platforms in 2025: Bedrock, Azure OpenAI, and Gemini</title>
      <dc:creator>Swati Tyagi</dc:creator>
      <pubDate>Sun, 28 Dec 2025 01:06:21 +0000</pubDate>
      <link>https://forem.com/swat_24/comparing-cloud-ai-platforms-in-2025-bedrock-azure-openai-and-gemini-4ij2</link>
      <guid>https://forem.com/swat_24/comparing-cloud-ai-platforms-in-2025-bedrock-azure-openai-and-gemini-4ij2</guid>
      <description>&lt;p&gt;Picking the right cloud AI service comes down to more than raw model performance. Your decision hinges on how well it meshes with your current stack, what it costs, and whether it meets your compliance needs.&lt;/p&gt;

&lt;p&gt;Having shipped production workloads on each of these platforms, I've learned a few things worth sharing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Quick Version
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Ideal Use Case&lt;/th&gt;
&lt;th&gt;What Sets It Apart&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;AWS Bedrock&lt;/td&gt;
&lt;td&gt;Switching between multiple models&lt;/td&gt;
&lt;td&gt;Smart routing that picks the right model automatically&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure OpenAI&lt;/td&gt;
&lt;td&gt;Enterprise access to GPT&lt;/td&gt;
&lt;td&gt;Tight Microsoft 365 connectivity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini API&lt;/td&gt;
&lt;td&gt;Processing huge documents&lt;/td&gt;
&lt;td&gt;Context window up to 2M tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  AWS Bedrock
&lt;/h2&gt;

&lt;p&gt;Bedrock is Amazon's managed gateway to foundation models from Anthropic, Meta, Mistral, Cohere, and others—all through one unified API.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it stands out:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Access Claude, Llama, Mistral, and Stable Diffusion without juggling multiple integrations&lt;/li&gt;
&lt;li&gt;Automatic prompt routing selects the most cost-effective model for each request (potential 30% savings)&lt;/li&gt;
&lt;li&gt;Plugs directly into S3, Lambda, and SageMaker&lt;/li&gt;
&lt;li&gt;Native RAG support with built-in vector storage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pricing snapshot (Claude 3.5 Sonnet):&lt;/strong&gt; $3/million input tokens, $15/million output tokens. Batch processing cuts costs in half.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best fit:&lt;/strong&gt; Teams already on AWS who want model flexibility and strong compliance credentials.&lt;/p&gt;

&lt;h2&gt;
  
  
  Azure OpenAI
&lt;/h2&gt;

&lt;p&gt;Microsoft's enterprise-grade wrapper around OpenAI's models, with security and governance baked in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it stands out:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Direct access to GPT-4o, o1, DALL-E 3, and Whisper&lt;/li&gt;
&lt;li&gt;Seamless hooks into Teams, Power Platform, and the broader Microsoft ecosystem&lt;/li&gt;
&lt;li&gt;Your data stays private and isn't used for training&lt;/li&gt;
&lt;li&gt;Provisioned Throughput Units (PTUs) for predictable billing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pricing snapshot (GPT-4o):&lt;/strong&gt; $2.50/million input tokens, $10/million output tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best fit:&lt;/strong&gt; Organizations already running Microsoft infrastructure who specifically need OpenAI models.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gemini API
&lt;/h2&gt;

&lt;p&gt;Google's multimodal platform with an industry-leading context window and native support for text, images, audio, and video.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it stands out:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;2M token context—roughly 16x GPT-4 Turbo's 128K window&lt;/li&gt;
&lt;li&gt;True multimodal processing without preprocessing steps&lt;/li&gt;
&lt;li&gt;Built-in web search grounding for real-time information&lt;/li&gt;
&lt;li&gt;Generous free tier (1,500+ daily requests)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pricing snapshot (Gemini 2.5 Pro):&lt;/strong&gt; $1.25/million input tokens (under 200K context), $10/million output tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best fit:&lt;/strong&gt; Document-heavy applications, multimodal use cases, or teams prototyping on a budget.&lt;/p&gt;
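&lt;p&gt;The three pricing snapshots can be compared head-to-head for a concrete workload. A small sketch using the rates quoted in this post (always verify current pricing before budgeting):&lt;/p&gt;

```typescript
// Compare the quoted per-million-token rates for a sample workload.
// Rates are the snapshots quoted above, not live pricing.
const platforms = [
  { name: "Bedrock (Claude 3.5 Sonnet)", input: 3.0, output: 15.0 },
  { name: "Azure OpenAI (GPT-4o)", input: 2.5, output: 10.0 },
  { name: "Gemini 2.5 Pro", input: 1.25, output: 10.0 },
];

// Monthly cost in USD for inM million input and outM million output tokens.
const cost = (p: { input: number; output: number }, inM: number, outM: number) =>
  p.input * inM + p.output * outM;

// Example: 5M input tokens and 1M output tokens per month.
for (const p of platforms) {
  console.log(p.name, cost(p, 5, 1).toFixed(2)); // 30.00, 22.50, 16.25
}
```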

&lt;h2&gt;
  
  
  How to Decide
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Already deep in AWS?&lt;/strong&gt; → Bedrock&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Need GPT-4 specifically?&lt;/strong&gt; → Azure OpenAI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Processing documents over 200K tokens?&lt;/strong&gt; → Gemini&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Early-stage or budget-conscious?&lt;/strong&gt; → Gemini's free tier&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Want to experiment across models?&lt;/strong&gt; → Bedrock&lt;/li&gt;
&lt;/ol&gt;
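&lt;p&gt;The checklist above folds naturally into a tiny helper. Purely illustrative; the criteria and platform names come straight from the list:&lt;/p&gt;

```typescript
// Illustrative decision helper mirroring the checklist above.
type Needs = {
  onAws?: boolean;
  needsGpt?: boolean;
  contextTokens?: number;
  budgetConscious?: boolean;
  multiModel?: boolean;
};

function pickPlatform(n: Needs): string {
  if (n.onAws) return "AWS Bedrock";
  if (n.needsGpt) return "Azure OpenAI";
  if (n.contextTokens !== undefined && n.contextTokens > 200_000) return "Gemini API";
  if (n.budgetConscious) return "Gemini API";
  if (n.multiModel) return "AWS Bedrock";
  return "evaluate all three";
}

console.log(pickPlatform({ contextTokens: 500_000 })); // Gemini API
```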

&lt;h2&gt;
  
  
  Saving Money
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bedrock:&lt;/strong&gt; Use batch mode and smart routing; enable prompt caching&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Azure:&lt;/strong&gt; Reserve PTUs for steady workloads; use batch API for non-urgent tasks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini:&lt;/strong&gt; Max out the free tier during development; use Flash models when speed matters less&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Bottom Line
&lt;/h2&gt;

&lt;p&gt;Each platform excels in different scenarios. Bedrock offers unmatched model flexibility. Azure OpenAI delivers the smoothest experience for Microsoft-centric teams. Gemini's massive context window changes what's possible for document analysis.&lt;/p&gt;

&lt;p&gt;No single platform wins across the board—your best choice depends on your existing infrastructure, specific model requirements, and budget. And honestly? You'll probably end up using more than one. 😅&lt;/p&gt;

</description>
      <category>aws</category>
      <category>bedrock</category>
      <category>openai</category>
      <category>gemini</category>
    </item>
  </channel>
</rss>
