<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Abhishek Gupta</title>
    <description>The latest articles on Forem by Abhishek Gupta (@abhishek_gupta_pinpo).</description>
    <link>https://forem.com/abhishek_gupta_pinpo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3888783%2Fddd00119-fad5-440b-8a81-734215d9c447.png</url>
      <title>Forem: Abhishek Gupta</title>
      <link>https://forem.com/abhishek_gupta_pinpo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/abhishek_gupta_pinpo"/>
    <language>en</language>
    <item>
      <title>DynamoDB vs RDS at 10K, 100K, and 1M RPS: a pre-deployment simulation comparison</title>
      <dc:creator>Abhishek Gupta</dc:creator>
      <pubDate>Thu, 23 Apr 2026 23:41:00 +0000</pubDate>
      <link>https://forem.com/abhishek_gupta_pinpo/dynamodb-vs-rds-at-10k-100k-and-1m-rps-a-pre-deployment-simulation-comparison-3eco</link>
      <guid>https://forem.com/abhishek_gupta_pinpo/dynamodb-vs-rds-at-10k-100k-and-1m-rps-a-pre-deployment-simulation-comparison-3eco</guid>
      <description>&lt;p&gt;I have made this mistake exactly once. About three years into my AWS career, I inherited a Lambda-based API with DynamoDB on the backend and was tasked with migrating it to Aurora PostgreSQL - the data model had grown relational and the team wanted proper foreign key constraints.&lt;/p&gt;

&lt;p&gt;The migration went smoothly in UAT. We promoted to production on a Tuesday night. By Thursday morning, Lambda concurrency was exhausted, Aurora was throwing connection pool errors, and I was sitting in a war room with the CTO trying to explain why a database migration - not a code change - had caused a full API outage at 80K RPS.&lt;/p&gt;

&lt;p&gt;I had never tested what the connection behaviour would look like at production load. I had assumed it would be fine.&lt;/p&gt;

&lt;p&gt;It was not fine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You cannot assume your way through database selection at scale.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;The methodology&lt;/h2&gt;

&lt;p&gt;I ran a structured simulation comparison using pinpole's pre-deployment canvas. Three separate canvases: one for DynamoDB, one for RDS/Aurora, and a third for Aurora Serverless v2 as a middle-ground comparison.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Canvas topology (all three configurations):&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Route 53 → CloudFront → API Gateway → Lambda → [DynamoDB | RDS + Proxy]
WAF (in front of CloudFront) · SQS (write decoupling path) · ElastiCache (RDS scenario)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note on RDS Proxy:&lt;/strong&gt; If you're running Lambda against RDS at any significant scale without RDS Proxy managing the connection pool, you will exhaust database connections under burst load. This is essentially the architecture bug that caused my Tuesday night disaster. RDS Proxy is always present in the RDS/Aurora configurations below.&lt;/p&gt;
&lt;/blockquote&gt;
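&lt;p&gt;The arithmetic behind that failure mode is worth making explicit. A rough sketch in Python - the numbers are illustrative, not from the incident, and connections-per-environment behaviour varies by driver:&lt;/p&gt;

```python
# Rough sketch of why Lambda + RDS without a proxy exhausts connections.
# Each concurrent Lambda environment typically holds its own database
# connection, and Postgres max_connections is a hard ceiling.
# All numbers below are illustrative assumptions.

def concurrent_executions(rps, avg_duration_s):
    """Little's law: concurrency is roughly arrival rate times duration."""
    return rps * avg_duration_s

def pool_exhausted(rps, avg_duration_s, max_connections):
    return concurrent_executions(rps, avg_duration_s) > max_connections

# 80K RPS at a 50ms average duration implies ~4,000 concurrent
# environments, each wanting its own connection - far beyond a typical
# Postgres max_connections in the low thousands.
print(concurrent_executions(80_000, 0.05))  # 4000.0
print(pool_exhausted(80_000, 0.05, 2000))   # True
```

&lt;p&gt;RDS Proxy multiplexes those thousands of client connections over a small shared pool of database connections, which is why it is treated as non-negotiable in the RDS configurations here.&lt;/p&gt;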

&lt;p&gt;All configurations at production-realistic specs: Lambda at 1,769 MB (1 vCPU equivalent), 30-second timeout; API Gateway at 10K RPS burst limit; DynamoDB in both on-demand and provisioned modes; RDS PostgreSQL on db.r6g instances; Aurora Serverless v2 with ACU limits appropriate to each tier.&lt;/p&gt;

&lt;p&gt;I ran four traffic patterns at each RPS level: &lt;strong&gt;Constant&lt;/strong&gt; (steady baseline), &lt;strong&gt;Ramp&lt;/strong&gt; (linear growth to peak over 10 minutes), &lt;strong&gt;Spike&lt;/strong&gt; (sudden 10× burst), and &lt;strong&gt;Wave&lt;/strong&gt; (oscillating between 30% and 100% of peak). Each run saved to execution history for comparison.&lt;/p&gt;
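&lt;p&gt;For reproducibility, the four patterns can be described as simple RPS-over-time functions. This is my sketch of the shapes as defined above - the simulator's exact curves are an assumption on my part:&lt;/p&gt;

```python
import math

# The four traffic patterns as RPS-over-time generators. Shapes follow
# the definitions in the text; exact curve parameters are assumptions.

def constant(peak, t, duration):
    return peak

def ramp(peak, t, duration):
    """Linear growth from zero to peak over the run."""
    return peak * min(1.0, t / duration)

def spike(peak, t, duration, baseline_frac=0.1):
    """Steady baseline, then a sudden 10x burst halfway through."""
    if t >= duration / 2:
        return peak
    return peak * baseline_frac

def wave(peak, t, duration, cycles=4):
    """Oscillates between 30% and 100% of peak."""
    phase = math.sin(2 * math.pi * cycles * t / duration)
    return peak * (0.65 + 0.35 * phase)
```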

&lt;h2&gt;Results at 10K RPS&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Configuration&lt;/th&gt;
&lt;th&gt;p50&lt;/th&gt;
&lt;th&gt;p99&lt;/th&gt;
&lt;th&gt;Monthly Estimate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DynamoDB on-demand&lt;/td&gt;
&lt;td&gt;2ms&lt;/td&gt;
&lt;td&gt;7ms&lt;/td&gt;
&lt;td&gt;~$8,400&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DynamoDB provisioned (9K RCU / 1K WCU)&lt;/td&gt;
&lt;td&gt;2ms&lt;/td&gt;
&lt;td&gt;7ms&lt;/td&gt;
&lt;td&gt;~$1,150&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RDS PostgreSQL db.r6g.2xlarge + Proxy&lt;/td&gt;
&lt;td&gt;3ms&lt;/td&gt;
&lt;td&gt;11ms&lt;/td&gt;
&lt;td&gt;~$870&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Aurora MySQL db.r6g.2xlarge + Proxy&lt;/td&gt;
&lt;td&gt;4ms&lt;/td&gt;
&lt;td&gt;13ms&lt;/td&gt;
&lt;td&gt;~$980&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Aurora Serverless v2 (avg 4 ACU)&lt;/td&gt;
&lt;td&gt;4ms&lt;/td&gt;
&lt;td&gt;15ms&lt;/td&gt;
&lt;td&gt;~$720&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The biggest surprise at 10K RPS: &lt;strong&gt;DynamoDB on-demand is nearly 10× more expensive than a well-configured RDS instance for sustained, predictable traffic.&lt;/strong&gt; DynamoDB's reputation as the "serverless database" leads engineers to assume it is cheap at modest scales. For a product with consistent diurnal load patterns, provisioned capacity is rarely the wrong answer, and on-demand is rarely the right one.&lt;/p&gt;
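&lt;p&gt;The gap is easy to sanity-check with back-of-envelope arithmetic. A sketch using historical us-east-1 list prices - treat the rates as assumptions and verify them against the current pricing page:&lt;/p&gt;

```python
# Back-of-envelope DynamoDB cost check for ~10K RPS of sustained traffic.
# List prices are assumptions based on historical us-east-1 pricing;
# verify against the current pricing page before relying on them.

HOURS_PER_MONTH = 730

def on_demand_monthly(read_rps, write_rps,
                      price_per_m_reads=0.25, price_per_m_writes=1.25):
    reads = read_rps * 3600 * HOURS_PER_MONTH
    writes = write_rps * 3600 * HOURS_PER_MONTH
    return (reads / 1e6) * price_per_m_reads + (writes / 1e6) * price_per_m_writes

def provisioned_monthly(rcu, wcu,
                        rcu_hourly=0.00013, wcu_hourly=0.00065):
    return (rcu * rcu_hourly + wcu * wcu_hourly) * HOURS_PER_MONTH

# ~9K read RPS / ~1K write RPS, mirroring the provisioned config above
print(round(on_demand_monthly(9_000, 1_000)))    # 9198 - same order as the table
print(round(provisioned_monthly(9_000, 1_000)))  # 1329 - roughly 7x cheaper
```

&lt;p&gt;The exact figures differ from the simulated estimates (which account for item sizes and request mix), but the order-of-magnitude gap between on-demand and provisioned at sustained load falls straight out of the pricing model.&lt;/p&gt;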

&lt;p&gt;Under the Spike pattern (10K → 100K instantaneous), DynamoDB on-demand absorbed the spike without configuration changes. RDS PostgreSQL with a fixed instance showed connection pool pressure - p99 climbed to 38ms for about 90 seconds.&lt;/p&gt;

&lt;h2&gt;Results at 100K RPS&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Configuration&lt;/th&gt;
&lt;th&gt;p50&lt;/th&gt;
&lt;th&gt;p99&lt;/th&gt;
&lt;th&gt;Monthly Estimate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DynamoDB provisioned (auto-scaling)&lt;/td&gt;
&lt;td&gt;2ms&lt;/td&gt;
&lt;td&gt;9ms&lt;/td&gt;
&lt;td&gt;~$4,200&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RDS PostgreSQL db.r6g.8xlarge + Proxy&lt;/td&gt;
&lt;td&gt;3ms&lt;/td&gt;
&lt;td&gt;14ms&lt;/td&gt;
&lt;td&gt;~$3,100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Aurora MySQL db.r6g.8xlarge + Proxy&lt;/td&gt;
&lt;td&gt;4ms&lt;/td&gt;
&lt;td&gt;16ms&lt;/td&gt;
&lt;td&gt;~$3,400&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Aurora Serverless v2 (avg 18 ACU)&lt;/td&gt;
&lt;td&gt;4ms&lt;/td&gt;
&lt;td&gt;19ms&lt;/td&gt;
&lt;td&gt;~$2,900&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;At 100K RPS, DynamoDB on-demand becomes structurally expensive - which is why it is absent from this table. The per-request pricing model that looks benign at 10K RPS scales linearly, so the on-demand bill simply multiplies by ten. Provisioned DynamoDB with auto-scaling changes the picture significantly. RDS remains cost-competitive because the fixed instance overhead is now amortised across far more requests.&lt;/p&gt;

&lt;p&gt;The Spike pattern at this tier produced the most diagnostic information. DynamoDB auto-scaling took 3-7 minutes to fully respond - during which pinpole flagged elevated p99 and recommended more aggressive scale-out settings. This is behaviour you need to know before deployment.&lt;/p&gt;
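&lt;p&gt;That lag is inherent to target-tracking auto-scaling: CloudWatch alarms must breach for consecutive periods before capacity steps toward consumed divided by target utilisation. A simplified model of the mechanism - the alarm windows and step behaviour here are approximations, not DynamoDB's exact algorithm:&lt;/p&gt;

```python
# Simplified model of target-tracking scale-out under a spike. Real
# CloudWatch alarm windows and step sizes differ; this only shows why
# capacity lags demand by minutes rather than seconds.

def scale_out(provisioned, consumed_per_min, target=0.70, alarm_minutes=2):
    """Yields provisioned capacity minute by minute."""
    breaches = 0
    for consumed in consumed_per_min:
        if consumed > provisioned * target:
            breaches += 1
        else:
            breaches = 0
        if breaches >= alarm_minutes:
            # Alarm fires: scale toward consumed / target utilisation
            provisioned = max(provisioned, consumed / target)
            breaches = 0
        yield provisioned

demand = [1_000] * 2 + [10_000] * 8   # 10x spike arriving at minute 2
capacity = list(scale_out(1_500, demand))
# Capacity holds at 1,500 through the alarm window, then jumps - and
# during that window the table throttles and p99 climbs, exactly the
# behaviour flagged at peak.
```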

&lt;h2&gt;Results at 1M RPS&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Configuration&lt;/th&gt;
&lt;th&gt;p50&lt;/th&gt;
&lt;th&gt;p99&lt;/th&gt;
&lt;th&gt;Monthly Estimate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DynamoDB provisioned (high WCU, DAX)&lt;/td&gt;
&lt;td&gt;1ms&lt;/td&gt;
&lt;td&gt;4ms&lt;/td&gt;
&lt;td&gt;~$28,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RDS PostgreSQL read replicas + Proxy&lt;/td&gt;
&lt;td&gt;4ms&lt;/td&gt;
&lt;td&gt;22ms&lt;/td&gt;
&lt;td&gt;~$18,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Aurora Global + Proxy&lt;/td&gt;
&lt;td&gt;3ms&lt;/td&gt;
&lt;td&gt;15ms&lt;/td&gt;
&lt;td&gt;~$24,000&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;At 1M RPS, DynamoDB with provisioned capacity and DAX caching is competitive on cost and substantially superior on latency. The operational complexity of the RDS path has increased materially - you now need read replicas, connection pooling strategy, and careful instance sizing - while the cost gap has narrowed.&lt;/p&gt;

&lt;h2&gt;The actual decision framework&lt;/h2&gt;

&lt;p&gt;Database selection at scale cannot be made responsibly without running the numbers at your anticipated traffic volume: the right answer at 10K RPS is sometimes the wrong answer at 100K RPS. The three factors that matter:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Access pattern complexity.&lt;/strong&gt; If your queries require joins, complex filtering, or ad-hoc analytical access, RDS is the correct starting point regardless of the cost model. DynamoDB's cost advantage evaporates if you are engineering around its access pattern constraints.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Traffic predictability.&lt;/strong&gt; Predictable diurnal load → provisioned DynamoDB or fixed RDS instance. Genuinely unpredictable or bursty traffic → DynamoDB on-demand or Aurora Serverless v2. Do not pay on-demand pricing for predictable traffic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Scale trajectory.&lt;/strong&gt; If you are at 10K RPS today and heading for 1M RPS in 18 months, the database you choose now needs to perform well at that scale. Running the 1M RPS simulation before making the 10K RPS decision is an hour of canvas work, not a Thursday morning war room.&lt;/p&gt;
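&lt;p&gt;The three factors reduce to a small decision table. This is my own encoding of the framework - not an official rule, and always subject to a simulation run at the target scale rather than today's:&lt;/p&gt;

```python
# The access-pattern and predictability factors above, encoded as a
# lookup. A simplification of the framework in the text, not a rule:
# the scale-trajectory factor still requires running the simulation.

def pick_database(relational_queries: bool, predictable_traffic: bool) -> str:
    if relational_queries:
        # Factor 1 dominates: do not fight DynamoDB's access model.
        if predictable_traffic:
            return "RDS/Aurora + Proxy"
        return "Aurora Serverless v2 + Proxy"
    # Key-value access patterns
    if predictable_traffic:
        return "DynamoDB provisioned (auto-scaling)"
    return "DynamoDB on-demand"

print(pick_database(relational_queries=False, predictable_traffic=True))
# DynamoDB provisioned (auto-scaling)
```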

&lt;p&gt;&lt;em&gt;Full comparison with complete per-node metrics at each tier, Aurora Serverless v2 deep-dive, and the access pattern decision matrix →&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>dynamodb</category>
      <category>rds</category>
      <category>cloudarchitecture</category>
    </item>
    <item>
      <title>How to model Lambda cold-start behaviour under spike traffic before you deploy</title>
      <dc:creator>Abhishek Gupta</dc:creator>
      <pubDate>Thu, 23 Apr 2026 01:41:00 +0000</pubDate>
      <link>https://forem.com/abhishek_gupta_pinpo/how-to-model-lambda-cold-start-behaviour-under-spike-traffic-before-you-deploy-1g9c</link>
      <guid>https://forem.com/abhishek_gupta_pinpo/how-to-model-lambda-cold-start-behaviour-under-spike-traffic-before-you-deploy-1g9c</guid>
      <description>&lt;p&gt;There is a class of AWS incident I have started calling the "everything looked fine in testing" failure.&lt;/p&gt;

&lt;p&gt;The pattern is consistent. You design a serverless API. Lambda function with sensible defaults, wired through API Gateway, pointing at DynamoDB. You test it in dev throughout the week. Latency is acceptable. Costs track to plan. Then a campaign drops, or a new enterprise customer brings their three thousand users on day one, and your traffic goes from 300 RPS to 3,000 RPS in under a minute.&lt;/p&gt;

&lt;p&gt;Lambda, which has never had to spin up more than a dozen concurrent environments at once, is now being asked to handle a hundred. Cold starts accumulate. p99 latency goes from 80ms to 2,400ms. API Gateway timeout windows close on in-flight requests. Customers see errors. The Slack channel fires. You spend a Saturday explaining to your CTO why the architecture that "passed all our tests" just fell over under a load it should have anticipated.&lt;/p&gt;

&lt;p&gt;I have been in this situation. More than once.&lt;/p&gt;

&lt;p&gt;The second time is when I stopped treating load testing as a post-deployment activity.&lt;/p&gt;

&lt;h2&gt;The cold start problem, precisely stated&lt;/h2&gt;

&lt;p&gt;Lambda's execution model does not maintain persistent servers. When an invocation arrives and no warm execution environment exists, Lambda must provision one: select a host, initialise the execution environment, load the runtime, execute your module-level initialisation code.&lt;/p&gt;

&lt;p&gt;That sequence is the cold start. And its duration varies along several dimensions that are non-obvious:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Runtime matters.&lt;/strong&gt; Node.js 20 with V8: typically under 100ms for lightweight functions. Python: comparable. Java with JVM class-loading: 300ms to well over a second.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory allocation matters.&lt;/strong&gt; Lambda allocates CPU proportionally to memory. A function at 1,024 MB gets significantly more CPU than one at 128 MB. Counterintuitively, increasing memory can reduce cold start latency and total cost simultaneously - the faster initialisation more than compensates for the higher per-GB-second rate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The spike dynamic is what kills you.&lt;/strong&gt; Cold starts at steady state are manageable. The problem is spike behaviour. Under rapid traffic increase, Lambda must provision new environments in parallel. You can hit dozens or hundreds of concurrent cold starts at the exact moment your users' experience is most consequential. A steady-state load test does not expose this.&lt;/p&gt;
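&lt;p&gt;The memory/cost interaction can be made concrete with a toy model, assuming the CPU-bound part of execution shrinks roughly inversely with allocated memory down to a floor. Real functions need profiling - AWS Lambda Power Tuning exists for exactly this - and the numbers here are illustrative:&lt;/p&gt;

```python
# Toy model of the Lambda memory/cost trade-off. Assumes duration
# shrinks inversely with memory (CPU scales with memory) down to a
# floor; real workloads need profiling. Rate is the historical x86
# list price per GB-second - verify against current pricing.

GB_SECOND_RATE = 0.0000166667

def invocation_cost(memory_mb, duration_ms):
    return (memory_mb / 1024) * (duration_ms / 1000) * GB_SECOND_RATE

def modeled_duration(memory_mb, work_ms_at_128=800, floor_ms=60):
    """CPU-bound work shrinks as memory (hence CPU) grows."""
    return max(floor_ms, work_ms_at_128 * 128 / memory_mb)

for mem in (128, 512, 1024):
    d = modeled_duration(mem)
    print(mem, round(d), f"{invocation_cost(mem, d):.9f}")
# Under these assumptions, 1024 MB runs 8x faster than 128 MB at the
# same per-invocation cost - the counterintuitive result above.
```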

&lt;h2&gt;The pre-deployment simulation model&lt;/h2&gt;

&lt;p&gt;For the simulation, I built the architecture on a pinpole canvas:&lt;/p&gt;

&lt;p&gt;Route 53 → CloudFront → API Gateway → Lambda (Node.js 20, 512MB) → DynamoDB&lt;/p&gt;

&lt;p&gt;Lambda was configured without provisioned concurrency - the common default for a new service with uncertain traffic. Reserved concurrency was set explicitly rather than left at the account-shared default.&lt;/p&gt;

&lt;p&gt;The Lambda config panel exposes the parameters that directly affect cold-start modelling:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameter&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;th&gt;Cold Start Relevance&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Runtime&lt;/td&gt;
&lt;td&gt;Node.js 20.x&lt;/td&gt;
&lt;td&gt;High - directly factors into latency model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory&lt;/td&gt;
&lt;td&gt;512 MB&lt;/td&gt;
&lt;td&gt;High - CPU allocation, init speed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reserved concurrency&lt;/td&gt;
&lt;td&gt;Set explicitly&lt;/td&gt;
&lt;td&gt;Critical - defines throttle ceiling&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Provisioned concurrency&lt;/td&gt;
&lt;td&gt;Off&lt;/td&gt;
&lt;td&gt;The variable under test&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;I ran a Spike traffic pattern: 300 RPS baseline → 3,000 RPS over 60 seconds. The concurrency graph showed cold-start accumulation in real time. At peak: 90 concurrent cold starts, 2,400ms p99 latency.&lt;/p&gt;

&lt;h2&gt;What the simulation surfaced that a load test could not&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. The burst scaling limit.&lt;/strong&gt; Lambda's initial burst quota is 500–3,000 concurrent executions depending on region, then 500 new environments per minute thereafter. This is not visible in the Lambda console until you hit it. The simulation reflects these constraints - the concurrency graph under spike traffic is not a smooth ramp. It shows the actual burst behaviour, including the plateau and the recovery slope.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Timeout alignment.&lt;/strong&gt; The simulation flagged a configuration where API Gateway's integration timeout and Lambda's execution timeout were both set to 29 seconds. Under concurrency pressure, invocations that queue before executing can consume their window before execution even begins. Surface this in a canvas session: costs nothing. Discover it in a 2 AM incident: costs considerably more.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. The provisioned concurrency trade-off, quantified.&lt;/strong&gt; I accepted the recommendation to enable provisioned concurrency, reran the simulation, and compared in the execution history view. p99 at peak: 80ms. The cost of provisioned concurrency was visible in the live estimate alongside the latency improvement. The trade-off was explicit and quantified before any IaC was written.&lt;/p&gt;

&lt;h2&gt;The reproducibility argument&lt;/h2&gt;

&lt;p&gt;The result I value most is not the simulation output itself - it is that the output is reproducible and shareable. When I shared this analysis with my team, I shared the simulation run: the exact canvas configuration, the traffic pattern, the concurrency graph, the before-and-after comparison. Not an assertion about expected behaviour. A versioned record of what the model showed.&lt;/p&gt;

&lt;p&gt;That is a materially different quality of architectural evidence.&lt;/p&gt;

&lt;p&gt;Full post with complete simulation methodology, burst scaling model details, provisioned concurrency cost analysis, and the pre-deployment Lambda checklist →&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>lambda</category>
      <category>cloudarchitecture</category>
    </item>
    <item>
      <title>FinOps at design time: I found $3,840/month in avoidable spend before writing a line of Terraform</title>
      <dc:creator>Abhishek Gupta</dc:creator>
      <pubDate>Mon, 20 Apr 2026 10:51:00 +0000</pubDate>
      <link>https://forem.com/abhishek_gupta_pinpo/finops-at-design-time-i-found-3840month-in-avoidable-spend-before-writing-a-line-of-terraform-oip</link>
      <guid>https://forem.com/abhishek_gupta_pinpo/finops-at-design-time-i-found-3840month-in-avoidable-spend-before-writing-a-line-of-terraform-oip</guid>
      <description>&lt;p&gt;FinOps is almost entirely retrospective. AWS Cost Explorer tells you what happened last billing cycle. Trusted Advisor tells you which resources are underutilised right now. Cost anomaly alerts fire after the anomaly has already run for hours.&lt;/p&gt;

&lt;p&gt;Every tool in the standard FinOps stack analyses infrastructure that already exists. Which means by the time any of them are useful, the structural decisions that determine 80% of your architecture's lifetime cost have already been made, deployed, and are now expensive to reverse.&lt;/p&gt;

&lt;p&gt;I have been an AWS solutions architect for nine years. The pattern is consistent, and I have been complicit in it: design the architecture, write the IaC, deploy, and then discover the cost. The Pricing Calculator gives you a static estimate that assumes steady-state traffic and correct configuration. Neither assumption holds under a real workload.&lt;/p&gt;

&lt;p&gt;This post is about a session where I broke that pattern - and caught $3,840 per month in avoidable spend before a single resource was provisioned.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The architecture&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Event processing pipeline for a Series B SaaS product. Customer activity events ingested via API, processed asynchronously, stored for downstream analytics. Expected baseline: 1,200 RPS, with a 6× spike on campaign days.&lt;/p&gt;

&lt;p&gt;Canvas topology in pinpole:&lt;/p&gt;

&lt;p&gt;Route 53 → API Gateway → Lambda (ingest) → SQS → Lambda (processor) → DynamoDB&lt;/p&gt;

&lt;p&gt;Lambda configured at 512 MB, reserved concurrency 200. DynamoDB in on-demand capacity mode. The AWS Pricing Calculator estimate at steady-state baseline: ~$4,100/month.&lt;/p&gt;

&lt;p&gt;Under a Constant simulation at 1,200 RPS, everything looked healthy. Cost settled at $4,230/month - close to the Pricing Calculator number, which felt like a good sign.&lt;/p&gt;

&lt;p&gt;My old workflow would have stopped there. Steady state is fine, cost is in range, proceed to deploy. pinpole's workflow does not stop there.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Finding 1: DynamoDB on-demand at spike load&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I ran a Spike pattern at 7,200 RPS - the 6× campaign day load. The AI recommendations panel updated within seconds.&lt;/p&gt;

&lt;p&gt;The finding: DynamoDB on-demand at 7,200 RPS ingest, with 1.4× write amplification to a secondary index, was going to produce approximately $2,890/month in DynamoDB write costs alone on campaign days. Provisioned capacity with auto-scaling - minimum 1,500 WCU, maximum 12,000 WCU, target utilisation 70% - would bring that to approximately $740/month.&lt;/p&gt;

&lt;p&gt;The Pricing Calculator estimate had modelled DynamoDB at steady-state write volume. It had not accounted for the spike multiplier. The difference: $2,150/month from one configuration decision.&lt;/p&gt;
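&lt;p&gt;Those auto-scaling bounds can be sanity-checked with the target-tracking formula (desired capacity = consumed capacity / target utilisation). A quick check, assuming standard writes of up to 1 KB:&lt;/p&gt;

```python
# Sanity check on the recommended auto-scaling bounds. Assumes standard
# writes of up to 1 KB (1 WCU each); larger items multiply consumption.

def desired_wcu(write_rps, amplification=1.4, target_util=0.70,
                min_wcu=1_500, max_wcu=12_000):
    consumed = write_rps * amplification   # WCU actually consumed
    desired = consumed / target_util       # target-tracking goal
    return max(min_wcu, min(max_wcu, desired))

print(round(desired_wcu(1_200)))   # 2400 - baseline sits between the bounds
print(round(desired_wcu(7_200)))   # 12000 - campaign spike pins the ceiling
```

&lt;p&gt;At the ceiling the table runs at roughly 84% utilisation - above the 70% target but below capacity - which is tolerable for a short campaign spike and is part of why the provisioned figure comes in so far under on-demand.&lt;/p&gt;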

&lt;p&gt;&lt;strong&gt;Finding 2: Lambda memory allocation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The AI recommendation engine flagged that both Lambda functions at 512 MB were likely operating in a region of the memory/cost curve where increasing memory allocation reduces total compute cost despite the higher per-GB-second rate. The reason: execution duration drops non-linearly as CPU increases, because Lambda allocates CPU proportionally to memory.&lt;/p&gt;

&lt;p&gt;I accepted the recommendation to raise memory to 1,024 MB and reran the simulation. Projected Lambda cost dropped. The configuration that performs better under load also costs less to run - a counterintuitive result that does not surface in any static calculator.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Finding 3: No distribution layer in front of API Gateway&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Under spike load, API Gateway was absorbing the full request volume directly. Adding CloudFront to the canvas and rerunning showed that cacheable responses no longer hit the origin - API Gateway RPS at the ingest layer dropped meaningfully at peak, and the monthly API Gateway cost reduction offset the CloudFront cost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The result&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DynamoDB (campaign day)&lt;/td&gt;
&lt;td&gt;$2,890/mo&lt;/td&gt;
&lt;td&gt;$740/mo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lambda (both functions)&lt;/td&gt;
&lt;td&gt;Baseline&lt;/td&gt;
&lt;td&gt;Reduced&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API Gateway + CloudFront&lt;/td&gt;
&lt;td&gt;$X&lt;/td&gt;
&lt;td&gt;$X − delta&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total identified saving&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$3,840/mo&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All three findings identified before a deployment pipeline was touched. The post-deployment validation on the optimised configuration came in at $30 under the simulation projection.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The broader point&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The dollar figure matters less than the mechanism. These are not obscure optimisations. DynamoDB capacity mode, Lambda memory right-sizing, and distribution layer decisions exist in almost every event-driven AWS architecture. They are routinely not caught until the first billing cycle - not because engineers are negligent, but because the tools required to catch them have historically required deployed infrastructure.&lt;/p&gt;

&lt;p&gt;That constraint is removable. The feedback loop that FinOps typically operates in - deploy, observe, optimise, redeploy - now has a step zero.&lt;/p&gt;

&lt;p&gt;Full post with simulation methodology, execution history, and the design-time FinOps checklist I now run on every new service →&lt;/p&gt;

&lt;p&gt;14-day Pro trial, no credit card. Free tier available at &lt;a href="https://app.pinpole.cloud"&gt;app.pinpole.cloud&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>finops</category>
      <category>cloudarchitecture</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
