<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Mayaank Vadlamani</title>
    <description>The latest articles on Forem by Mayaank Vadlamani (@mayaankvad).</description>
    <link>https://forem.com/mayaankvad</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3507785%2F28baa4a7-c5ce-463b-b0fc-c03c8a560b77.jpg</url>
      <title>Forem: Mayaank Vadlamani</title>
      <link>https://forem.com/mayaankvad</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/mayaankvad"/>
    <language>en</language>
    <item>
      <title>Lambda Explained: A Visual Journey from Init to Invoke</title>
      <dc:creator>Mayaank Vadlamani</dc:creator>
      <pubDate>Wed, 17 Sep 2025 02:05:41 +0000</pubDate>
      <link>https://forem.com/mayaankvad/lambda-explained-a-visual-journey-from-init-to-invoke-3cb8</link>
      <guid>https://forem.com/mayaankvad/lambda-explained-a-visual-journey-from-init-to-invoke-3cb8</guid>
      <description>&lt;h2&gt;
  
  
  Introduction: Why Lambda Behaves the Way It Does
&lt;/h2&gt;

&lt;p&gt;AWS Lambda has revolutionized how developers build and run applications by abstracting away server management and offering automatic scaling. But despite its popularity, many users do not fully understand some core behaviors — like why the first request to a Lambda function can take noticeably longer (the infamous cold start), or why each Lambda “instance” only handles one request at a time, even when multiple requests come in simultaneously.&lt;/p&gt;

&lt;p&gt;This article will explain how the Lambda runtime manages function execution using the &lt;strong&gt;Lambda Runtime API&lt;/strong&gt;, how the official Go SDK (&lt;a href="https://github.com/aws/aws-lambda-go" rel="noopener noreferrer"&gt;aws-lambda-go&lt;/a&gt;) interfaces with it, and why these details matter for performance and scaling.&lt;br&gt;
To make this tangible, I built a demo web app that simulates Lambda’s behavior by spawning isolated Go processes, each acting like a Firecracker microVM handling a single request. The app visualizes cold starts, warm invocations, and concurrent scaling on a timeline, so you can see Lambda’s internal workings in action.&lt;/p&gt;

&lt;p&gt;This article will focus on &lt;strong&gt;synchronous invocations&lt;/strong&gt; (e.g., API Gateway calling a Lambda and waiting for the response) rather than &lt;strong&gt;asynchronous invocations&lt;/strong&gt; (like S3 events or EventBridge, where events are queued and processed later). Synchronous invokes are where cold starts, latency, and concurrency limits are most directly visible to end users, making them the clearest way to understand how the Lambda runtime behaves.&lt;/p&gt;

&lt;p&gt;By the end, you’ll have a clearer picture of Lambda’s architecture, why cold starts happen, and how your functions scale — empowering you to write more efficient serverless applications.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🔗 The source code and demo are available on GitHub at &lt;a href="https://github.com/mayaankvad/lambda-visualizer" rel="noopener noreferrer"&gt;mayaankvad/lambda-visualizer&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  The Lifecycle of a Lambda Execution Environment
&lt;/h2&gt;

&lt;p&gt;AWS Lambda runs your function code inside isolated execution environments called microVMs, built on Firecracker. Each environment goes through a lifecycle consisting of three main phases: &lt;strong&gt;Init&lt;/strong&gt;, &lt;strong&gt;Invoke&lt;/strong&gt;, and &lt;strong&gt;Shutdown&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Below is a high-level illustration of the &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtime-environment.html#cold-start-latency" rel="noopener noreferrer"&gt;lifecycle&lt;/a&gt;:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0imm2pu7c2pr683lo3pc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0imm2pu7c2pr683lo3pc.png" alt="Cold starts and latency" width="800" height="177"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This diagram shows how the Init phase prepares the execution environment, the Invoke phase runs your handler for each request, and the Shutdown phase cleans everything up. Understanding what happens in each phase helps explain Lambda behaviors like cold starts and how scaling works under the hood.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;👉 The official AWS documentation on &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtime-environment.html" rel="noopener noreferrer"&gt;Lambda Execution Environment Lifecycle&lt;/a&gt; provides an extensive technical breakdown, including diagrams and additional nuances.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  Init Phase: Preparing the Execution Environment
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;Init phase&lt;/strong&gt; occurs when a request comes in and there’s no existing warm environment ready to handle it — for example, the first invocation of your function or a burst in traffic requiring more concurrent instances. During Init, multiple AWS internal services work together to provision a secure, isolated environment on a worker host.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8di1vvtfgjqfhid6cu0a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8di1vvtfgjqfhid6cu0a.png" alt="Cold Synchronous Invoke" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here’s what happens step by step:&lt;/p&gt;

&lt;p&gt;1️⃣ &lt;strong&gt;Invoke request lands at the Frontend Load Balancers&lt;/strong&gt;, which distribute it across multiple Availability Zones (AZs) for high availability.&lt;/p&gt;

&lt;p&gt;2️⃣ The request goes to the &lt;strong&gt;Frontend Invoke Service&lt;/strong&gt;, which:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Authenticates and authorizes the caller to ensure only trusted requests reach your Lambda function.&lt;/li&gt;
&lt;li&gt;  Loads and caches metadata about your function, like its configuration and handler information.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;3️⃣ The Frontend then calls the &lt;strong&gt;Counting Service&lt;/strong&gt; to check concurrency quotas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  This service enforces your account limits, reserved concurrency settings, and burst capacity.&lt;/li&gt;
&lt;li&gt;  It is optimized for high throughput and low latency (under 1.5 ms) since it sits in the critical path of every request.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;4️⃣ Next, the &lt;strong&gt;Assignment Service&lt;/strong&gt; is hit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  It’s a stateful service that tracks all available execution environments.&lt;/li&gt;
&lt;li&gt;  For cold starts, it decides that a new execution environment must be created.&lt;/li&gt;
&lt;li&gt;  It coordinates the creation, assignment, and management of execution environments across the fleet.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;5️⃣ The Assignment Service calls the &lt;strong&gt;Placement Service&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Placement uses machine learning models to choose an optimal worker host, balancing cold start performance with fleet efficiency.&lt;/li&gt;
&lt;li&gt;  Placement decisions also consider the health of worker hosts and avoid unhealthy nodes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;6️⃣ A &lt;strong&gt;Worker Host&lt;/strong&gt; spins up a &lt;strong&gt;Firecracker microVM&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  This microVM is a lightweight virtual machine that provides secure, fast-starting isolation for your function.&lt;/li&gt;
&lt;li&gt;  The worker downloads your deployment package or container image.&lt;/li&gt;
&lt;li&gt;  The microVM initializes the runtime (e.g., Go, Node, Python) and runs any initialization code outside the handler, such as global variable setup or static resource creation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;7️⃣ The &lt;strong&gt;IAM role and environment variables&lt;/strong&gt; configured for your function are retrieved by the Assignment Service and securely passed to the worker host.&lt;/p&gt;

&lt;p&gt;8️⃣ Once the environment is fully initialized, the Assignment Service notifies the Frontend Invoke Service that the environment is ready.&lt;/p&gt;

&lt;p&gt;9️⃣ Finally, the Frontend sends the &lt;strong&gt;invoke payload&lt;/strong&gt; to the newly created execution environment, where your function handler executes.&lt;/p&gt;

&lt;p&gt;This Init phase explains why cold starts take longer than warm invocations: AWS must coordinate multiple control plane services, place your environment on a worker, spin up a microVM, and initialize your code before it can process the request.&lt;/p&gt;
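
&lt;p&gt;To make the "initialization code outside the handler" part concrete, here is a minimal sketch in plain Go. It does not use the Lambda SDK; &lt;code&gt;loadConfig&lt;/code&gt;, &lt;code&gt;handler&lt;/code&gt;, and the &lt;code&gt;orders&lt;/code&gt; table name are hypothetical, and the loop in &lt;code&gt;main&lt;/code&gt; only imitates several warm invocations landing in the same process.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import "fmt"

// initCount records how many times the setup below ran in this process.
var initCount int

// config stands in for a resource built once, outside the handler
// (hypothetical; think SDK clients or parsed configuration).
var config = loadConfig()

func loadConfig() map[string]string {
    initCount++
    return map[string]string{"table": "orders"}
}

// handler is a hypothetical handler that reuses config on every call.
func handler(orderID string) string {
    return config["table"] + "/" + orderID
}

func main() {
    // Imitate three warm invocations in the same environment.
    for _, id := range []string{"a1", "b2", "c3"} {
        fmt.Println(handler(id))
    }
    fmt.Println("init ran", initCount, "time(s)")
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The setup runs once per process no matter how many times the handler is called, which is why work placed outside the handler is paid for only on a cold start.&lt;/p&gt;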
&lt;h3&gt;
  
  
  Invoke Phase: Processing Requests in a Warm Environment
&lt;/h3&gt;

&lt;p&gt;Once the Init phase completes, the Lambda environment enters the &lt;strong&gt;Invoke phase&lt;/strong&gt;. Subsequent requests sent to your Lambda function &lt;em&gt;do not&lt;/em&gt; repeat the entire cold start process. Here’s what’s important:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  The Frontend Invoke Service checks with the Assignment Service, which determines whether there’s a &lt;strong&gt;warm and idle execution environment&lt;/strong&gt; available.&lt;/li&gt;
&lt;li&gt;  If there is, the payload is sent directly to the existing microVM, and your handler runs almost immediately — this is what makes warm invocations much faster.&lt;/li&gt;
&lt;li&gt;  After the handler completes, the worker notifies the Assignment Service it’s ready for another invocation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzpmjbakevj444ae5jmuu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzpmjbakevj444ae5jmuu.png" alt="Warm Synchronous Invoke" width="800" height="452"&gt;&lt;/a&gt;&lt;/p&gt;
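
&lt;p&gt;The warm-versus-cold decision described above can be sketched as a small pool. This is a toy model under stated assumptions, not how the Assignment Service is implemented: the &lt;code&gt;environment&lt;/code&gt; and &lt;code&gt;pool&lt;/code&gt; types and their methods are hypothetical names invented for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import "fmt"

// environment is a hypothetical record of one execution environment.
type environment struct {
    id   int
    busy bool
}

// pool is a toy stand-in for the bookkeeping described above.
type pool struct {
    envs []*environment
}

// acquire returns an idle warm environment if one exists. Otherwise it
// creates a new one, which corresponds to a cold start.
func (p *pool) acquire() (env *environment, cold bool) {
    for _, e := range p.envs {
        if !e.busy {
            e.busy = true
            return e, false
        }
    }
    e := new(environment)
    e.id = len(p.envs) + 1
    e.busy = true
    p.envs = append(p.envs, e)
    return e, true
}

// release marks the environment idle so a later request can reuse it.
func (p *pool) release(e *environment) { e.busy = false }

func main() {
    p := new(pool)
    a, coldA := p.acquire() // first request: cold start
    b, coldB := p.acquire() // arrives while a is busy: another cold start
    p.release(a)
    c, coldC := p.acquire() // a is idle again: warm reuse
    fmt.Println(a.id, coldA, b.id, coldB, c.id, coldC)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Running it prints &lt;code&gt;1 true 2 true 1 false&lt;/code&gt;: two cold starts while requests overlap, then a warm reuse once the first environment is released.&lt;/p&gt;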
&lt;h3&gt;
  
  
  Shutdown Phase: Cleaning Up
&lt;/h3&gt;

&lt;p&gt;When an execution environment has been idle for too long (usually 5–15 minutes) or AWS needs to reclaim resources, the Assignment Service marks the environment as unavailable. The microVM is then &lt;strong&gt;terminated&lt;/strong&gt;, freeing resources for other workloads. This keeps Lambda cost-effective by avoiding long-lived idle environments — unless you use &lt;strong&gt;provisioned concurrency&lt;/strong&gt;, which keeps environments warm proactively.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;👉 The whole lifecycle process was explained in more detail at &lt;a href="https://youtu.be/0_jfH6qijVY?t=671" rel="noopener noreferrer"&gt;re:Invent 2022: A closer look at AWS Lambda&lt;/a&gt;. This is an amazing must-watch resource.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  The Lambda Runtime API: How Lambdas Pull Work
&lt;/h2&gt;

&lt;p&gt;Once your execution environment is initialized, the &lt;strong&gt;runtime&lt;/strong&gt; — the component in the microVM responsible for running your function’s handler — needs a way to receive events and send results back to the Lambda service. This communication happens through the &lt;strong&gt;Lambda Runtime API&lt;/strong&gt;, which AWS provides as an HTTP endpoint unique to each execution environment. Although it often seems like Lambda pushes events directly to your function, the reality is that each execution environment &lt;strong&gt;pulls events&lt;/strong&gt; by polling the Runtime API — your function only runs after the environment actively requests the next event.&lt;/p&gt;

&lt;p&gt;Here’s a diagram illustrating how this works:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F31jxuqygy24s0lr3cmak.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F31jxuqygy24s0lr3cmak.png" alt="Runtime Loop" width="800" height="264"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  How the Runtime API Works: The Runtime Loop
&lt;/h3&gt;

&lt;p&gt;Each runtime runs a &lt;strong&gt;runtime loop&lt;/strong&gt;, where it repeatedly polls the Lambda Runtime API and processes events. Here’s how it works, step by step:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1️⃣ Poll for an event&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The runtime sends a &lt;code&gt;GET /2018-06-01/runtime/invocation/next&lt;/code&gt; request. This is a long-polling call that blocks until a new event is ready.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2️⃣ Process the event&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When the Runtime API responds, the runtime passes the event payload to your function’s handler code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3️⃣ Report the result&lt;/strong&gt;&lt;br&gt;
After the handler completes, the runtime sends the result to either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;code&gt;POST /2018-06-01/runtime/invocation/{requestId}/response&lt;/code&gt; for a successful invocation.&lt;/li&gt;
&lt;li&gt;  &lt;code&gt;POST /2018-06-01/runtime/invocation/{requestId}/error&lt;/code&gt; if an error occurred.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;4️⃣ Repeat&lt;/strong&gt;&lt;br&gt;
The runtime goes back to step 1, polling for the next event until the environment is shut down.&lt;/p&gt;

&lt;p&gt;This loop continues for as long as the execution environment is warm, enabling repeated invocations without a cold start.&lt;/p&gt;
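
&lt;p&gt;The four steps above can be sketched with nothing but Go's standard library. Treat this as an illustration rather than a production runtime: &lt;code&gt;handle&lt;/code&gt;, &lt;code&gt;report&lt;/code&gt;, and &lt;code&gt;processOne&lt;/code&gt; are hypothetical names, the upper-casing handler and the &lt;code&gt;HandlerError&lt;/code&gt; type string are placeholders, and details such as deadlines and tracing headers are left out. The sketch relies on the &lt;code&gt;AWS_LAMBDA_RUNTIME_API&lt;/code&gt; environment variable for the endpoint's host and port and on the &lt;code&gt;Lambda-Runtime-Aws-Request-Id&lt;/code&gt; response header for the request ID.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "os"
)

// apiVersion is the version prefix used in the Runtime API paths.
const apiVersion = "2018-06-01"

// handle is a hypothetical stand-in for your function's handler.
func handle(event []byte) ([]byte, error) {
    return bytes.ToUpper(event), nil
}

// report posts a result or error document and closes the reply body.
func report(url string, body []byte) error {
    resp, err := http.Post(url, "application/json", bytes.NewReader(body))
    if err != nil {
        return err
    }
    return resp.Body.Close()
}

// processOne runs a single iteration of the runtime loop against base,
// for example "http://127.0.0.1:9001".
func processOne(base string) error {
    // Step 1: long-poll for the next event; blocks until one is ready.
    resp, err := http.Get(base + "/" + apiVersion + "/runtime/invocation/next")
    if err != nil {
        return err
    }
    event, err := io.ReadAll(resp.Body)
    resp.Body.Close()
    if err != nil {
        return err
    }
    requestID := resp.Header.Get("Lambda-Runtime-Aws-Request-Id")
    invocation := base + "/" + apiVersion + "/runtime/invocation/" + requestID

    // Step 2: run the handler. Step 3: report success or failure.
    result, handlerErr := handle(event)
    if handlerErr != nil {
        payload, _ := json.Marshal(map[string]string{
            "errorMessage": handlerErr.Error(),
            "errorType":    "HandlerError",
        })
        return report(invocation+"/error", payload)
    }
    return report(invocation+"/response", result)
}

func main() {
    host := os.Getenv("AWS_LAMBDA_RUNTIME_API")
    if host == "" {
        fmt.Println("AWS_LAMBDA_RUNTIME_API is not set; not running inside Lambda")
        return
    }
    // Step 4: repeat until the environment is shut down.
    for {
        if err := processOne("http://" + host); err != nil {
            fmt.Println("runtime loop error:", err)
            return
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because &lt;code&gt;processOne&lt;/code&gt; does not ask for another event until the previous result has been reported, the loop itself is what limits an environment to one request at a time.&lt;/p&gt;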
&lt;h3&gt;
  
  
  Custom Runtimes &amp;amp; Standard Runtimes
&lt;/h3&gt;

&lt;p&gt;AWS provides the &lt;strong&gt;Lambda Runtime API&lt;/strong&gt; so you can build your own custom runtimes, letting you run Lambda functions in languages not officially supported by AWS or customize how events are processed. The process for building a custom runtime is detailed in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/runtimes-custom.html" rel="noopener noreferrer"&gt;Building a custom runtime for AWS Lambda&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/runtimes-api.html" rel="noopener noreferrer"&gt;Lambda Runtime API reference&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For most use cases, AWS offers &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html" rel="noopener noreferrer"&gt;official runtimes&lt;/a&gt; for supported languages, like the &lt;a href="https://github.com/aws/aws-lambda-go" rel="noopener noreferrer"&gt;aws-lambda-go&lt;/a&gt; runtime for Go. This SDK implements the polling loop and Runtime API calls for you, so your handler can simply process events without worrying about the underlying mechanics.&lt;/p&gt;

&lt;p&gt;👉 We’ll dive deeper into the Go SDK later in this article to show exactly how this works and why it matters.&lt;/p&gt;
&lt;h3&gt;
  
  
  Why this matters
&lt;/h3&gt;

&lt;p&gt;Even though these Runtime API calls happen behind the scenes in standard runtimes, understanding that Lambda uses a &lt;strong&gt;pull-based execution model&lt;/strong&gt; is essential for predicting performance and avoiding surprises:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1️⃣ One request per instance&lt;/strong&gt;: Because each Lambda execution environment polls and processes events one at a time, a single environment cannot handle multiple concurrent requests. To scale to higher request volumes, AWS creates additional Lambda execution environments as needed, each processing requests serially.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2️⃣ Scaling ≠ warming up a single instance&lt;/strong&gt;: Many developers assume that as load increases, Lambda will keep a few instances “warm” and simply feed them more events. In reality, each new concurrent request beyond the current warm capacity causes Lambda to spin up a new environment. Under sudden traffic spikes (e.g., bursty API Gateway usage), this can mean dozens or hundreds of cold starts in parallel.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3️⃣ Concurrency limits&lt;/strong&gt;: Your AWS account has a &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html#compute-and-storage" rel="noopener noreferrer"&gt;regional concurrency limit&lt;/a&gt; (default 1,000 concurrent executions per region) shared across all functions. Hitting this limit will cause invocations to be throttled, so understanding how quickly Lambda scales out helps you plan capacity and avoid unexpected 429 errors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4️⃣ Cold start impact&lt;/strong&gt;: Since each new environment must go through the Init phase, bursts of new concurrent requests can lead to many cold starts, which may add latency to requests and degrade the user experience.&lt;/p&gt;

&lt;p&gt;A common example of where this behavior surprises developers is building APIs with &lt;strong&gt;API Gateway + Lambda&lt;/strong&gt;. When a sudden spike of requests hits your API, Lambda will rapidly spin up new execution environments for each concurrent request beyond your current warm capacity. Each of these new environments incurs a &lt;strong&gt;cold start&lt;/strong&gt;, adding latency to those initial requests. This cold start behavior often isn’t apparent in development, where requests are typically made one after another rather than in large bursts.&lt;/p&gt;
&lt;h3&gt;
  
  
  Why One Request at a Time?
&lt;/h3&gt;

&lt;p&gt;One of the most important architectural details of Lambda — and something often misunderstood — is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Each Lambda execution environment can only process one request at a time.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Even though Lambda scales out across multiple instances, &lt;strong&gt;individual Lambda environments are strictly single-request&lt;/strong&gt;. This design is intentional:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  It ensures consistent, isolated execution for every request.&lt;/li&gt;
&lt;li&gt;  There’s no risk of concurrent shared-state corruption inside your function.&lt;/li&gt;
&lt;li&gt;  It allows the environment to be stateless, disposable, and parallelizable.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Demo &amp;amp; Timeline View: Visualizing Lambda Behavior
&lt;/h2&gt;

&lt;p&gt;To bring these Lambda execution principles to life, I built a web application that simulates AWS Lambda’s request lifecycle. This interactive demo illustrates how cold starts, concurrency, and warm invocations actually play out in practice—something that’s often hidden behind the scenes in typical development workflows.&lt;/p&gt;

&lt;p&gt;The simulator uses lightweight Go processes to emulate Lambda’s behavior, with each process acting as a single, isolated execution environment. Here’s a walkthrough of what you’ll see when you use the demo.&lt;/p&gt;
&lt;h3&gt;
  
  
  1️⃣ Invoke a single function.
&lt;/h3&gt;

&lt;p&gt;When you send a single invocation, the simulator reproduces a cold start. A new Go process is spun up, just as Lambda would create a new microVM. The timeline view shows a new bar representing this new instance. The console logs will reflect the entire lifecycle:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc5ulth4iq4yd18e050t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc5ulth4iq4yd18e050t.png" alt="Sync Invoke Page" width="800" height="410"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;pre&gt;
&lt;span&gt;info:&lt;/span&gt; &lt;span&gt;💻 User Called Sync Invoke. Request ID: 26dab387-6a2a-4f88-8cfa-05ed68c5abf3&lt;/span&gt;
&lt;span&gt;info:&lt;/span&gt; &lt;span&gt;&lt;strong&gt;🚀 Lambda Runtime Created: PID 27949&lt;/strong&gt;&lt;/span&gt;
&lt;span&gt;info:&lt;/span&gt; &lt;span&gt;⌛ Lambda Runtime Polled For Event&lt;/span&gt;
&lt;span&gt;debug:&lt;/span&gt; &lt;span&gt;Sending event to Lambda Runtime 26dab387-6a2a-4f88-8cfa-05ed68c5abf3&lt;/span&gt;
&lt;span&gt;debug:&lt;/span&gt; &lt;span&gt;[PID: 27949] stderr: 2025/09/04 20:03:38 INFO PID 27949 received event. AwsRequestID: 26dab387-6a2a-4f88-8cfa-05ed68c5abf3&lt;/span&gt;
&lt;span&gt;info:&lt;/span&gt; &lt;span&gt;✅ Lambda Runtime reported success. Request ID: 26dab387-6a2a-4f88-8cfa-05ed68c5abf3&lt;/span&gt;
&lt;span&gt;info:&lt;/span&gt; &lt;span&gt;⌛ Lambda Runtime Polled For Event&lt;/span&gt;
&lt;/pre&gt;
&lt;h3&gt;
  
  
  2️⃣ Send multiple concurrent invocations.
&lt;/h3&gt;

&lt;p&gt;Now, let's simulate a traffic spike by sending five concurrent invocations, with each lasting a different duration. Just like in a real Lambda environment, you interact with a simple web interface to send these simulated invocations. For each incoming request, the simulator spins up a new, isolated Go process. This faithfully mimics how Lambda uses lightweight microVMs for each instance.&lt;/p&gt;

&lt;p&gt;Since each simulated process, like a real Lambda function, handles only one request at a time, the simulator must launch additional instances to handle the load. When multiple requests arrive in parallel or while previous ones are still running, the simulator triggers a new cold start for each one. This visually demonstrates how concurrency causes Lambda to scale out by creating new, isolated instances, each one incurring its own cold start.&lt;/p&gt;

&lt;p&gt;On the timeline, you’ll see distinct rows, each representing a new Lambda instance. The console logs will show this process happening in parallel, with multiple Lambda Runtime Created entries and corresponding Polled For Event entries for each new instance. This behavior is a foundational concept of Lambda and is often counterintuitive to developers accustomed to multithreaded environments. The demo makes this abstract concept tangible by clearly showing that a new instance is spun up for each concurrent request, rather than an existing instance handling multiple events at once.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1z4abqsnbkzqfcw4itzh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1z4abqsnbkzqfcw4itzh.png" alt="Invocation Timeline" width="800" height="194"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🛠️ Try it yourself: &lt;br&gt; &lt;code&gt;docker run -p 8000:8000 -it ghcr.io/mayaankvad/lambda-visualizer:latest&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Here is an example from the AWS Documentation:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkp6gti2zv814h8oxk9c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkp6gti2zv814h8oxk9c.png" alt="Lambda concurrency diagram" width="800" height="380"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-concurrency.html" rel="noopener noreferrer"&gt;Lambda concurrency&lt;/a&gt; documentation describes in detail how to calculate concurrency, visualize the two main concurrency control options (reserved and provisioned), estimate appropriate concurrency control settings, and view metrics for further optimization.&lt;/p&gt;
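
&lt;p&gt;As a rough illustration of that calculation, the AWS documentation estimates concurrency as average requests per second multiplied by average request duration in seconds. The sketch below applies that steady-state formula; &lt;code&gt;requiredConcurrency&lt;/code&gt; and &lt;code&gt;newEnvironments&lt;/code&gt; are hypothetical helper names, and the traffic numbers are made up for the example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "math"
)

// requiredConcurrency estimates concurrent executions as requests per
// second multiplied by average duration in seconds, rounded up.
func requiredConcurrency(rps, avgDurationSec float64) int {
    return int(math.Ceil(rps * avgDurationSec))
}

// newEnvironments returns how many extra environments (each one a cold
// start) are needed beyond the warm ones already available.
func newEnvironments(required, warm int) int {
    return int(math.Max(0, float64(required-warm)))
}

func main() {
    required := requiredConcurrency(100, 0.5) // 100 req/s at 500 ms each
    fmt.Println("required concurrency:", required)
    fmt.Println("cold starts with 10 warm:", newEnvironments(required, 10))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With these numbers the estimate is 50 concurrent executions, so a fleet with only 10 warm environments would need roughly 40 cold starts to absorb the load.&lt;/p&gt;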
&lt;h2&gt;
  
  
  Deep Dive AWS Lambda Go
&lt;/h2&gt;

&lt;p&gt;Now that we’ve visualized how Lambda behaves at the infrastructure level, let’s look at the code that powers each Lambda instance in our demo — specifically how the &lt;a href="https://github.com/aws/aws-lambda-go" rel="noopener noreferrer"&gt;aws-lambda-go&lt;/a&gt; runtime works under the hood.&lt;/p&gt;
&lt;h3&gt;
  
  
  Why Go?
&lt;/h3&gt;

&lt;p&gt;Most Lambda tutorials and production code you’ll find use Python, JavaScript, or Java — all of which hide the complexity of the Lambda Runtime API behind a simple handler declaration like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Behind the scenes, AWS runs a prebuilt runtime for those languages that automatically handles polling for events, decoding input, running your handler, and reporting results — all invisible to you.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Go is different&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Go doesn’t use an interpreted runtime or metadata-based handler mapping. It compiles to a static binary, and as such, &lt;strong&gt;you can’t just tell Lambda, "run this function."&lt;/strong&gt; Instead, AWS provides an actual Go package — &lt;a href="https://github.com/aws/aws-lambda-go" rel="noopener noreferrer"&gt;aws-lambda-go/lambda&lt;/a&gt; — that your program must explicitly import and run. That package &lt;strong&gt;implements the entire Lambda Runtime API interaction&lt;/strong&gt;: polling for requests, decoding JSON input, invoking your handler, and posting results back.&lt;/p&gt;

&lt;p&gt;This means Go gives us a &lt;strong&gt;front-row seat&lt;/strong&gt; to how Lambda actually works behind the curtain — making it the perfect language for our simulation.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is aws-lambda-go?
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://github.com/aws/aws-lambda-go" rel="noopener noreferrer"&gt;aws-lambda-go&lt;/a&gt; SDK is the official AWS-provided library for running Go-based Lambda functions. Its core package (lambda) contains logic that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Initializes the function runtime.&lt;/li&gt;
&lt;li&gt;Connects to the internal Lambda Runtime API.&lt;/li&gt;
&lt;li&gt;Enters the runtime loop.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Unlike other languages, in Go, &lt;strong&gt;your code is responsible for initiating and maintaining this lifecycle&lt;/strong&gt;. Your &lt;code&gt;main()&lt;/code&gt; function calls &lt;code&gt;lambda.Start(handler)&lt;/code&gt; — and from there, the SDK continuously manages communication with the Lambda Service.&lt;/p&gt;

&lt;p&gt;That means your Go binary is what “boots” the Lambda instance, pulls the event, runs your logic, and reports the response. This matches our simulation closely: each Go process in the demo acts like a full Lambda runtime in miniature. By stepping into the SDK code, we can see the interaction with the runtime API.&lt;/p&gt;

&lt;h3&gt;
  
  
  Breaking Down the Lambda Go Runtime
&lt;/h3&gt;

&lt;p&gt;Let’s examine how &lt;code&gt;lambda.Start&lt;/code&gt; actually works. This function is the entry point for any Go-based Lambda function and is responsible for initiating the polling loop that communicates with the Lambda Runtime API.&lt;/p&gt;

&lt;p&gt;Your program must include a &lt;code&gt;main()&lt;/code&gt; function that calls &lt;code&gt;lambda.Start&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;lambda&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;YourHandlerFunc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is where the Lambda Go SDK takes over. Let’s walk through what happens next.&lt;/p&gt;

&lt;h4&gt;
  
  
  Start
&lt;/h4&gt;

&lt;p&gt;The public Start function sets up some defaults and then invokes an internal start function.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/aws/aws-lambda-go/blob/42a01a9d1f01a6218e10ab874fa50ed1a3dc0ef9/lambda/entry.go#L99-L113" rel="noopener noreferrer"&gt;lambda/entry.go#L99-L113&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;handlerOptions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;keys&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;startFunctions&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c"&gt;// in normal operation, the start function never returns&lt;/span&gt;
            &lt;span class="c"&gt;// if it does, exit!, this triggers a restart of the lambda function&lt;/span&gt;
            &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;logFatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%v"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;keys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;logFatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"expected AWS Lambda environment variables %s are not defined"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This loop checks each registered environment variable and, when one is set, invokes the corresponding startup function. If none of the expected variables is set, it logs a fatal error that lists them, and the process exits.&lt;/p&gt;

&lt;h4&gt;
  
  
  startFunctions
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://github.com/aws/aws-lambda-go/blob/main/lambda/entry.go#L77-L86" rel="noopener noreferrer"&gt;lambda/entry.go#L77-L86&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;runtimeAPIStartFunction&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;startFunction&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"AWS_LAMBDA_RUNTIME_API"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="n"&gt;startRuntimeAPILoop&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;startFunctions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;startFunction&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;runtimeAPIStartFunction&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c"&gt;// This allows end to end testing of the Start functions, by tests overwriting this function to keep the program alive&lt;/span&gt;
    &lt;span class="n"&gt;logFatalf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The only defined startup function here is &lt;code&gt;startRuntimeAPILoop&lt;/code&gt;, which is used when running in the AWS Lambda environment.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;ℹ️ At launch both RPC and HTTP start functions were included here. Now, RPC is added conditionally via build tags in &lt;a href="https://github.com/aws/aws-lambda-go/blob/42a01a9d1f01a6218e10ab874fa50ed1a3dc0ef9/lambda/rpc_function.go#L22-L33" rel="noopener noreferrer"&gt;rpc_function.go&lt;/a&gt; to reduce bloat. See: &lt;a href="https://github.com/aws/aws-lambda-go/pull/456" rel="noopener noreferrer"&gt;PR #456&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  startRuntimeAPILoop
&lt;/h4&gt;

&lt;p&gt;This is the core loop that powers the Lambda runtime for Go.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/aws/aws-lambda-go/blob/42a01a9d1f01a6218e10ab874fa50ed1a3dc0ef9/lambda/invoke_loop.go#L31-L43" rel="noopener noreferrer"&gt;lambda/invoke_loop.go#L31-L43&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// startRuntimeAPILoop will return an error if handling a particular invoke resulted in a non-recoverable error&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;startRuntimeAPILoop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="n"&gt;Handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;newRuntimeAPIClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;newHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;handleInvoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This loop is critical: it pulls the next event from the Lambda Runtime API (&lt;code&gt;client.next&lt;/code&gt;), then passes it to your handler via &lt;code&gt;handleInvoke&lt;/code&gt;. Each call is blocking and sequential. The loop starts no goroutine per event, so each Lambda instance processes only one request at a time. This matches the “one event per environment” model, where concurrency comes from additional instances rather than from threads inside a single instance.&lt;/p&gt;
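
&lt;p&gt;To make the pull model concrete, here is a minimal, self-contained sketch of the same fetch, handle, respond cycle written against the documented Runtime API paths (&lt;code&gt;/2018-06-01/runtime/invocation/next&lt;/code&gt; and &lt;code&gt;/2018-06-01/runtime/invocation/{id}/response&lt;/code&gt;). It is not the SDK's code: &lt;code&gt;runLoop&lt;/code&gt;, &lt;code&gt;fakeRuntimeAPI&lt;/code&gt;, and &lt;code&gt;upper&lt;/code&gt; are hypothetical names, the in-process test server only stands in for the real service, and the loop stops after a fixed number of events so that the example terminates.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

const apiVersion = "2018-06-01"

// runLoop fetches one event, runs the handler, and posts the result before
// asking for the next event. Nothing here runs concurrently.
// The limit parameter exists only so this sketch terminates.
func runLoop(base string, limit int, handle func(string) string) ([]string, error) {
	var ids []string
	for i := 0; i != limit; i++ {
		resp, err := http.Get(base + "/" + apiVersion + "/runtime/invocation/next")
		if err != nil {
			return ids, err
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return ids, err
		}
		id := resp.Header.Get("Lambda-Runtime-Aws-Request-Id")

		out := handle(string(body)) // blocks until the handler returns

		url := base + "/" + apiVersion + "/runtime/invocation/" + id + "/response"
		post, err := http.Post(url, "application/json", strings.NewReader(out))
		if err != nil {
			return ids, err
		}
		post.Body.Close()
		ids = append(ids, id)
	}
	return ids, nil
}

// fakeRuntimeAPI is a hypothetical stand-in for the Lambda service: it hands
// out numbered events and accepts any posted response.
func fakeRuntimeAPI() *httptest.Server {
	var mu sync.Mutex
	n := 0
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if strings.HasSuffix(r.URL.Path, "/runtime/invocation/next") {
			mu.Lock()
			n++
			id := n
			mu.Unlock()
			w.Header().Set("Lambda-Runtime-Aws-Request-Id", fmt.Sprintf("req-%d", id))
			fmt.Fprintf(w, `{"n":%d}`, id)
			return
		}
		w.WriteHeader(http.StatusAccepted)
	}))
}

// upper is a trivial handler used for the demonstration.
func upper(event string) string { return strings.ToUpper(event) }

func main() {
	srv := fakeRuntimeAPI()
	defer srv.Close()
	ids, err := runLoop(srv.URL, 3, upper)
	fmt.Println(ids, err)
}
```

&lt;p&gt;Because the next request for an event is only issued after the previous response has been posted, one process can never hold two events at once in this design.&lt;/p&gt;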

&lt;h4&gt;
  
  
  handleInvoke
&lt;/h4&gt;

&lt;p&gt;This function prepares the context and invokes your handler function.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/aws/aws-lambda-go/blob/42a01a9d1f01a6218e10ab874fa50ed1a3dc0ef9/lambda/invoke_loop.go#L46-L66" rel="noopener noreferrer"&gt;lambda/invoke_loop.go#L46-L66&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// handleInvoke returns an error if the function panics, or some other non-recoverable error occurred&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;handleInvoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;invoke&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;handlerOptions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="err"&gt;…&lt;/span&gt;

    &lt;span class="c"&gt;// set the invoke metadata values&lt;/span&gt;
    &lt;span class="n"&gt;lc&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;lambdacontext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LambdaContext&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;AwsRequestID&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;       &lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;InvokedFunctionArn&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;headerInvokedFunctionARN&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lambdacontext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;lc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="err"&gt;…&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This builds the &lt;code&gt;context.Context&lt;/code&gt; passed to your handler, populated with metadata like &lt;code&gt;AwsRequestID&lt;/code&gt; and the ARN of the function.&lt;/p&gt;

&lt;h4&gt;
  
  
  Calling the Handler
&lt;/h4&gt;

&lt;p&gt;The user-defined handler function is then executed:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/aws/aws-lambda-go/blob/42a01a9d1f01a6218e10ab874fa50ed1a3dc0ef9/lambda/invoke_loop.go#L68-L75" rel="noopener noreferrer"&gt;lambda/invoke_loop.go#L68-L75&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;    &lt;span class="c"&gt;// call the handler, marshal any returned error&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;invokeErr&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;callBytesHandlerFunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;handlerFunc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inside &lt;code&gt;callBytesHandlerFunc&lt;/code&gt;, the SDK recovers from panics and converts both panics and returned errors into a structured error response:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;callBytesHandlerFunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="n"&gt;handlerFunc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="n"&gt;io&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Reader&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;invokeErr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InvokeResponse_Error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;recover&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;invokeErr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lambdaPanicResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}()&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lambdaErrorResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ensures that any uncaught panics or errors are reported cleanly back to the Lambda service using a structured error response.&lt;/p&gt;

&lt;h4&gt;
  
  
  Reporting Success
&lt;/h4&gt;

&lt;p&gt;After the handler completes successfully, the result is sent back to Lambda:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/aws/aws-lambda-go/blob/42a01a9d1f01a6218e10ab874fa50ed1a3dc0ef9/lambda/invoke_loop.go#L97-L99" rel="noopener noreferrer"&gt;lambda/invoke_loop.go#L97-L99&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;handleInvoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;invoke&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;handlerOptions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="err"&gt;…&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;contentType&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"unexpected error occurred when sending the function functionResponse to the API: %v"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This completes one full invocation cycle. Immediately afterward, the runtime loop continues and requests the next event:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;startRuntimeAPILoop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="n"&gt;Handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="err"&gt;…&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c"&gt;// blocks until the next event arrives; error checks omitted&lt;/span&gt;
        &lt;span class="n"&gt;handleInvoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That completes the walkthrough of the runtime loop: fetch the next event, build the context, call the handler, report the result, and repeat.&lt;/p&gt;

&lt;h4&gt;
  
  
  Only One Event at a Time — No Goroutines
&lt;/h4&gt;

&lt;p&gt;Nowhere in this code path are goroutines used. All calls to &lt;code&gt;client.next()&lt;/code&gt; and &lt;code&gt;handleInvoke()&lt;/code&gt; are synchronous and blocking. The SDK does not spin off new goroutines per invocation — meaning each Lambda instance handles only one event at a time, fully serialized.&lt;/p&gt;

&lt;p&gt;If multiple events arrive concurrently, the Lambda service will spin up multiple instances — each running its own runtime loop — rather than handling them concurrently in the same instance.&lt;/p&gt;

&lt;p&gt;This behavior is foundational to understanding concurrency, isolation, and cold starts in AWS Lambda.&lt;/p&gt;

&lt;h4&gt;
  
  
  Handler Reuse and Initialization Best Practices
&lt;/h4&gt;

&lt;p&gt;One important takeaway from this runtime architecture: only your &lt;strong&gt;handler function&lt;/strong&gt; is invoked for each event — not the entire module or binary.&lt;br&gt;
Your Lambda is initialized &lt;strong&gt;once per cold start&lt;/strong&gt;, and then reused across multiple invocations. This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Global variables, &lt;code&gt;init()&lt;/code&gt; functions, and any setup outside the handler run only once — at cold start.&lt;/li&gt;
&lt;li&gt;  Code inside your handler runs &lt;strong&gt;per invocation&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the runtime code, you can see this in how &lt;code&gt;newHandler(handler)&lt;/code&gt; is constructed once in &lt;code&gt;startRuntimeAPILoop&lt;/code&gt;, and reused repeatedly inside the loop.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;newHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// initialized once&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c"&gt;// error checks omitted&lt;/span&gt;
    &lt;span class="n"&gt;handleInvoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// handler reused&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This makes it critical to &lt;strong&gt;avoid putting initialization logic inside your handler function&lt;/strong&gt;. Doing so would repeat expensive setup (e.g., opening database connections) for every request. Instead, place that logic in package-level variables, an &lt;code&gt;init()&lt;/code&gt; function, or &lt;code&gt;main()&lt;/code&gt; before the call to &lt;code&gt;lambda.Start&lt;/code&gt;, so it runs only during the cold start phase.&lt;/p&gt;
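
&lt;p&gt;The following sketch illustrates the difference, assuming a hypothetical &lt;code&gt;newExpensiveClient&lt;/code&gt; in place of a real database or SDK client. To stay self-contained it does not import aws-lambda-go; instead, &lt;code&gt;main&lt;/code&gt; simulates three warm invocations by calling the handler directly, where a real function would call &lt;code&gt;lambda.Start(handler)&lt;/code&gt;.&lt;/p&gt;

```go
package main

import "fmt"

var (
	initCount   int               // how many times the expensive setup ran
	invokeCount int               // how many events the handler processed
	client      map[string]string // stand-in for an expensive client, such as a DB pool
)

// newExpensiveClient is hypothetical; imagine it opens connections or loads config.
func newExpensiveClient() map[string]string {
	initCount++
	return map[string]string{"endpoint": "example.internal"}
}

// init runs once per process, which on Lambda means once per cold start.
func init() {
	client = newExpensiveClient()
}

// handler only does per-request work and reuses the shared client.
func handler(event string) string {
	invokeCount++
	return fmt.Sprintf("handled %s via %s", event, client["endpoint"])
}

func main() {
	// A real function would call lambda.Start(handler) here.
	// This loop simulates three invocations on the same warm instance.
	for _, e := range []string{"a", "b", "c"} {
		fmt.Println(handler(e))
	}
	fmt.Println("init runs:", initCount, "invocations:", invokeCount)
}
```

&lt;p&gt;Had &lt;code&gt;newExpensiveClient&lt;/code&gt; been called inside &lt;code&gt;handler&lt;/code&gt;, the setup count would grow with every event instead of staying at one.&lt;/p&gt;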

&lt;p&gt;In summary, the &lt;a href="https://github.com/aws/aws-lambda-go" rel="noopener noreferrer"&gt;aws-lambda-go&lt;/a&gt; runtime provides a transparent and straightforward implementation of how Lambda functions operate under the hood. By reading the actual code, we see clearly that each Lambda instance handles only one request at a time, synchronously, without goroutines or parallel execution. This reinforces why concurrency is managed at the infrastructure level, not the application level. Understanding these mechanics enables you to write more efficient, predictable Lambda functions — especially when it comes to handling initialization, state, and cold starts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Understanding what happens behind the scenes is the key to writing better, more performant serverless applications. By exploring the Lambda Runtime API and how the Go SDK implements it, we’ve learned that a Lambda function's lifecycle is more complex than just a simple handler. The pull-based model, single-request environments, and the clear distinction between initialization and invocation are foundational concepts that empower you to anticipate cold starts and manage concurrent traffic more effectively.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you’d like to see these concepts in action, check out the full source code for the simulator used in this article. You can find the repository here: &lt;a href="https://github.com/mayaankvad/lambda-visualizer" rel="noopener noreferrer"&gt;mayaankvad/lambda-visualizer&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Sources
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;a href="https://www.youtube.com/watch?v=0_jfH6qijVY" rel="noopener noreferrer"&gt;re:Invent SVS404-R video&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html" rel="noopener noreferrer"&gt;Lambda Runtimes&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtime-environment.html" rel="noopener noreferrer"&gt;Lambda Execution Environment Lifecycle&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/runtimes-api.html" rel="noopener noreferrer"&gt;Runtime API&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/runtimes-custom.html" rel="noopener noreferrer"&gt;Building custom runtimes&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-concurrency.html" rel="noopener noreferrer"&gt;Understanding Lambda function scaling&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>lambda</category>
      <category>go</category>
    </item>
  </channel>
</rss>
