<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Pubudu Jayawardana</title>
    <description>The latest articles on Forem by Pubudu Jayawardana (@pubudusj).</description>
    <link>https://forem.com/pubudusj</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F353213%2F355626e5-6081-4cf4-8e50-d8783b1da6ff.jpeg</url>
      <title>Forem: Pubudu Jayawardana</title>
      <link>https://forem.com/pubudusj</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/pubudusj"/>
    <language>en</language>
    <item>
      <title>Understanding Lambda Tenant Isolation</title>
      <dc:creator>Pubudu Jayawardana</dc:creator>
      <pubDate>Sun, 22 Feb 2026 23:33:54 +0000</pubDate>
      <link>https://forem.com/aws-builders/understanding-lambda-tenant-isolation-4kdc</link>
      <guid>https://forem.com/aws-builders/understanding-lambda-tenant-isolation-4kdc</guid>
      <description>&lt;p&gt;Lambda tenant isolation is one of the important security features that came out of the 2025 re:Invent season.&lt;/p&gt;

&lt;p&gt;Achieving tenant isolation in SaaS applications is not straightforward, and taking the single-tenant route to solve it introduces its own scaling challenges. This new feature is not a silver bullet, but it does offer much better support for keeping tenants isolated at scale.&lt;/p&gt;

&lt;p&gt;In this blog post, I discuss what this feature is and the problems it addresses.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lambda execution environment
&lt;/h2&gt;

&lt;p&gt;When an invoke request first reaches the AWS Lambda service, Lambda starts a virtualized environment on an EC2 worker host. We call this an execution environment. The execution environment downloads the code and required dependencies, processes the request, and returns the response if required.&lt;/p&gt;

&lt;p&gt;One of the key attributes of this execution environment is that it is not removed or deleted immediately after processing a single request. It is kept in a 'warm' state to serve another incoming request. When the next request comes in, the Lambda service uses the execution environment that is already available to process it, without creating a new one.&lt;/p&gt;

&lt;p&gt;Likewise, when you invoke a Lambda function, the Lambda service uses any execution environments already available for that function to process the requests; otherwise, it creates new ones.&lt;/p&gt;

&lt;p&gt;However, this behaviour can be a concern for a multi-tenant Lambda function, because execution environments carry 'leftover' state between invocations, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Global variables&lt;/li&gt;
&lt;li&gt;Objects initialized outside of the handler&lt;/li&gt;
&lt;li&gt;Files saved in the /tmp directory&lt;/li&gt;
&lt;/ul&gt;
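&lt;p&gt;To make this leftover state concrete, here is a minimal, hypothetical handler (not from any particular project) in which both a global counter and a file in &lt;code&gt;/tmp&lt;/code&gt; survive between requests that land on the same warm execution environment:&lt;/p&gt;

```python
# Hypothetical handler illustrating state that survives warm invocations.
# Both the global counter and the file in /tmp persist between requests
# that land on the same execution environment.

import os

counter = 0  # initialized once per execution environment, not per request

def handler(event, context):
    global counter
    counter += 1  # the value left by the previous invocation is visible here

    cache_path = "/tmp/cache.txt"
    if os.path.exists(cache_path):
        # a file saved by a previous (possibly different) tenant may still be here
        with open(cache_path) as f:
            cached = f.read()
    else:
        cached = None
        with open(cache_path, "w") as f:
            f.write("data for tenant %s" % event.get("tenant_id", "unknown"))

    return {"invocation_count": counter, "leftover_cache": cached}
```

&lt;p&gt;Two back-to-back invocations on the same environment would see &lt;code&gt;invocation_count&lt;/code&gt; climb and the second caller would read the first caller's cached file.&lt;/p&gt;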

&lt;h2&gt;
  
  
  Multi-tenant Lambda function
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2q16t958g395l2m1228c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2q16t958g395l2m1228c.png" alt="Image: Multi-tenant Lambda function" width="800" height="369"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this multi-tenant setup, a Lambda function is shared by more than one tenant. Based on how execution environments behave, irrespective of which tenant invokes the Lambda function, the Lambda service uses whichever execution environments are already available for that function. But, as mentioned earlier, execution environments carrying data across executions can be a security issue.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Sharing data across invocations within an execution environment can be a great optimization if that data is accessible only by the intended tenant. However, when multiple tenants use the same execution environments, a tenant can gain access to data it was never intended to see.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For example, within a single execution environment, tenant 1 might fetch some secrets from Secrets Manager or save some files in the /tmp directory during its execution. If the same execution environment is then used for an execution of tenant 2, tenant 2 has access to the secrets fetched for tenant 1 and the contents tenant 1 saved in the /tmp directory.&lt;/p&gt;

&lt;h3&gt;
  
  
  Solution
&lt;/h3&gt;

&lt;p&gt;One solution to this problem is to reset the execution environment just before processing each request: unsetting any global variables, wiping the /tmp directory, and so on. However, this approach is not practical at scale.&lt;/p&gt;
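&lt;p&gt;A naive version of that reset could look like the sketch below. The scratch-directory name and the list of globals are my own illustrative choices; the point is that enumerating every kind of leftover state reliably is exactly what becomes impractical at scale:&lt;/p&gt;

```python
# Naive per-request "reset" sketch (illustrative only): restore known globals
# to their defaults and wipe the function's /tmp scratch area before each
# request. Catching every form of leftover state this way is error-prone.

import os
import shutil

SCRATCH_DIR = "/tmp/app-scratch"  # hypothetical scratch area used by the handler
counter = 0

def reset_environment():
    global counter
    counter = 0  # forget state left behind by the previous tenant
    shutil.rmtree(SCRATCH_DIR, ignore_errors=True)
    os.makedirs(SCRATCH_DIR, exist_ok=True)

def handler(event, context):
    global counter
    reset_environment()  # must run before any tenant-specific work
    counter += 1
    return {"invocation_count": counter}
```

&lt;p&gt;With the reset in place, every invocation starts from a clean slate, but only for the state the reset code happens to know about.&lt;/p&gt;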

&lt;p&gt;Another option is tenant-specific Lambda functions: the single-tenant approach, where each tenant gets its own dedicated Lambda function. This solves the problem of unintended access to temporary data, because execution environments belonging to different Lambda functions are never shared.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8yov989thxkq9z7rsbo4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8yov989thxkq9z7rsbo4.png" alt="Image: Single-tenant Lambda functions" width="800" height="377"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;However, having a Lambda function per tenant is not scalable. With a lot of tenants, you end up with a lot of Lambda functions to manage. While this is possible with an IaC tool like CDK or Terraform, whenever you need to update the source code to introduce new functionality or a bug fix, you have to update all of these tenant-specific resources, which is not easy. And, most of the time, those resources are not fully utilized either.&lt;/p&gt;

&lt;p&gt;What if we could have a single Lambda function shared by all of the tenants (the multi-tenant approach), yet with the isolation we get from a Lambda function per tenant (the single-tenant approach)?&lt;/p&gt;

&lt;h2&gt;
  
  
  Lambda tenant isolation
&lt;/h2&gt;

&lt;p&gt;With Lambda tenant isolation, you can have exactly that: a single Lambda function shared by all tenants, with the isolation of a Lambda function per tenant. With this new feature, the Lambda service does the heavy lifting by creating execution environments that are dedicated to a specific tenant. Execution environments are not shared across tenants, so tenants cannot access data that is not intended for them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqdj7msvdewauh18dah69.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqdj7msvdewauh18dah69.png" alt="Image: Tenant isolated Lambda function" width="800" height="382"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But how does Lambda determine which tenant an incoming request comes from? For that, we need to provide a &lt;strong&gt;tenant-id&lt;/strong&gt; in the request to Lambda.&lt;br&gt;
If you use the Lambda invoke CLI command, you can use the &lt;code&gt;--tenant-id&lt;/code&gt; parameter as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws lambda invoke &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--function-name&lt;/span&gt; tenant-aware-lambda &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--payload&lt;/span&gt; &lt;span class="s1"&gt;'{ "name": "Bob" }'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tenant-id&lt;/span&gt; t1 &lt;span class="se"&gt;\&lt;/span&gt;
    response.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you use the Lambda API directly, you need to provide the value in the &lt;code&gt;X-Amz-Tenant-Id&lt;/code&gt; header as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="nf"&gt;POST&lt;/span&gt; &lt;span class="nn"&gt;/2015-03-31/functions/tenant-aware-lambda/invocations&lt;/span&gt; &lt;span class="k"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;1.1&lt;/span&gt;
&lt;span class="na"&gt;Host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;lambda.eu-central-1.amazonaws.com&lt;/span&gt;
&lt;span class="na"&gt;Content-Type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;application/json&lt;/span&gt;
&lt;span class="na"&gt;Authorization&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS4-HMAC-SHA256 Credential=...&lt;/span&gt;
&lt;span class="na"&gt;X-Amz-Tenant-Id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;t1&lt;/span&gt;

&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bob"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;strong&gt;tenant id&lt;/strong&gt; is case sensitive and can be any alphanumeric string with a maximum length of 256 characters. A few special characters are allowed too: hyphens (-), underscores (_), colons (:), equals (=), plus (+), at (@) and periods (.).&lt;/p&gt;
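&lt;p&gt;If you want to catch malformed tenant ids before they reach Lambda, the rules above translate into a simple check. This &lt;code&gt;is_valid_tenant_id&lt;/code&gt; helper is my own sketch, not part of any AWS SDK:&lt;/p&gt;

```python
# Client-side sanity check for tenant ids, based on the rules described above:
# alphanumeric plus - _ : = + @ . with a maximum length of 256 characters.

import re

TENANT_ID_PATTERN = re.compile(r"^[A-Za-z0-9\-_:=+@.]{1,256}$")

def is_valid_tenant_id(tenant_id):
    return bool(TENANT_ID_PATTERN.match(tenant_id))
```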

&lt;p&gt;One of the key attributes of the tenant id is that we don't need to pre-register tenant ids. We can pass any dynamic value as the tenant id, and the Lambda service takes care of creating and maintaining a pool of execution environments for each value passed. Also, we can use any number of unique tenant ids; there is no limit.&lt;/p&gt;

&lt;h3&gt;
  
  
  How to enable this feature in Lambda
&lt;/h3&gt;

&lt;p&gt;If you use the AWS Console, you can enable this in the additional security section of the Lambda creation wizard.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2hpjwx3r9uo70optiu0a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2hpjwx3r9uo70optiu0a.png" alt="Image: Creating Lambda function in AWS console" width="800" height="233"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Options are also available for the CLI, CloudFormation and CDK.&lt;/p&gt;

&lt;p&gt;CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws lambda create-function &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--function-name&lt;/span&gt; tenant-aware-lambda &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--runtime&lt;/span&gt; python3.14 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--zip-file&lt;/span&gt; fileb://tenant-aware-lambda.zip &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--handler&lt;/span&gt; index.handler &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--role&lt;/span&gt; arn:aws:iam:123456789012:role/execution-role &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tenancy-config&lt;/span&gt; &lt;span class="s1"&gt;'{"TenantIsolationMode": "PER_TENANT"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;CloudFormation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;MyLambdaFunction&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Lambda::Function&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;FunctionName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;tenant-aware-lambda&lt;/span&gt;
      &lt;span class="na"&gt;Runtime&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python3.14&lt;/span&gt;
      &lt;span class="na"&gt;Role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;LambdaExecutionRole.Arn&lt;/span&gt;
      &lt;span class="na"&gt;Handler&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;index.handler&lt;/span&gt;
      &lt;span class="na"&gt;TenancyConfig&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;TenantIsolationMode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PER_TENANT&lt;/span&gt;
      &lt;span class="na"&gt;Code&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;ZipFile&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;.....&lt;/span&gt;
          &lt;span class="s"&gt;.....&lt;/span&gt;
      &lt;span class="na"&gt;Timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
      &lt;span class="na"&gt;MemorySize&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;128&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;CDK:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;tenant_aware_lambda&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_lambda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TenantAwareFunction&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;function_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tenant-aware-lambda&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;runtime&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;_lambda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Runtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PYTHON_3_13&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;index.handler&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;_lambda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Code&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_asset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;src/lambda/tenant_aware&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;tenancy_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;_lambda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TenancyConfig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PER_TENANT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Please note: This feature can be enabled ONLY when the Lambda function is created. You cannot enable this for an existing Lambda function.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Try this yourself
&lt;/h3&gt;

&lt;p&gt;I have created a sample application for you to see how this feature works. You can deploy it to your AWS environment using CDK with Python.&lt;/p&gt;

&lt;p&gt;Clone the repository at &lt;a href="https://github.com/pubudusj/lambda-tenant-isolation-demo" rel="noopener noreferrer"&gt;github.com/pubudusj/lambda-tenant-isolation-demo&lt;/a&gt; and follow the steps below:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create a virtual environment and activate it:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 &lt;span class="nt"&gt;-m&lt;/span&gt; venv .venv
&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Install the dependencies:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Deploy the stack:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cdk deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will create two Lambda functions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A generic Lambda function (without tenant isolation)&lt;/li&gt;
&lt;li&gt;A Lambda function with tenant isolation enabled&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Both Lambda functions have a global variable &lt;code&gt;counter&lt;/code&gt; which increments on each invocation. This is to simulate the shared state across executions. An API Gateway is also created with two endpoints to trigger these Lambda functions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/execute_generic_lambda?tenant_id=&amp;lt;tenant_id&amp;gt;&lt;/code&gt; - invokes the generic Lambda function&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/execute_tenant_aware_lambda?tenant_id=&amp;lt;tenant_id&amp;gt;&lt;/code&gt; - invokes the tenant-aware Lambda function&lt;/li&gt;
&lt;/ul&gt;
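&lt;p&gt;The demo's handlers can be sketched roughly as follows (a simplified, hypothetical version; the repository is the reference). The generic function reads the tenant id from the query string and bumps a module-level counter:&lt;/p&gt;

```python
# Simplified sketch of the generic demo handler (not the exact repository
# code). The module-level counter increments on every invocation handled by
# this execution environment, regardless of which tenant made the request.

import json

counter = 0

def generic_handler(event, context):
    global counter
    counter += 1
    tenant_id = (event.get("queryStringParameters") or {}).get("tenant_id")
    return {
        "statusCode": 200,
        "body": json.dumps({"tenant_id": tenant_id, "invocation_count": counter}),
    }
```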

&lt;h3&gt;
  
  
  Testing
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Generic Lambda function
&lt;/h4&gt;

&lt;p&gt;First, let's test the generic Lambda function. Call the &lt;code&gt;/execute_generic_lambda&lt;/code&gt; endpoint with a tenant id:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"&amp;lt;APIGW_BASE_URL&amp;gt;/execute_generic_lambda?tenant_id=tenant1"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You will see a response like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"tenant_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tenant1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"invocation_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now call the same endpoint again but with a different tenant id:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"&amp;lt;APIGW_BASE_URL&amp;gt;/execute_generic_lambda?tenant_id=tenant2"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see the invocation count keeps increasing regardless of the tenant id:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"tenant_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tenant2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"invocation_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is because the generic Lambda function shares execution environments across all tenants. The global variable &lt;code&gt;counter&lt;/code&gt; retains its value across invocations regardless of which tenant made the request. This is the exact problem we discussed earlier - any global state, cached data or files in &lt;code&gt;/tmp&lt;/code&gt; are accessible across tenants.&lt;/p&gt;

&lt;h4&gt;
  
  
  Tenant-aware Lambda function
&lt;/h4&gt;

&lt;p&gt;Now let's test the tenant-aware Lambda function. Call the &lt;code&gt;/execute_tenant_aware_lambda&lt;/code&gt; endpoint with a tenant id:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"&amp;lt;APIGW_BASE_URL&amp;gt;/execute_tenant_aware_lambda?tenant_id=tenant1"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You will see a response like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"tenant"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tenant1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"invocation_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now call the same endpoint again with a different tenant id:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"&amp;lt;APIGW_BASE_URL&amp;gt;/execute_tenant_aware_lambda?tenant_id=tenant2"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This time, the invocation count resets:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"tenant"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tenant2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"invocation_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is tenant isolation in action. Even though both requests are handled by the same Lambda function, Lambda service creates separate execution environments for each tenant. The global variable &lt;code&gt;counter&lt;/code&gt;, or any other shared state in the execution environment, is isolated per tenant. Tenant 2 will never see the state left behind by Tenant 1.&lt;/p&gt;

&lt;p&gt;Also note that in the tenant-aware Lambda function, we can access the tenant id from the Lambda context object (e.g. &lt;code&gt;context.tenant_id&lt;/code&gt; in Python) instead of extracting it from the query parameters. The Lambda service automatically makes the tenant id available in the context when tenant isolation is enabled. This is useful if you need to perform operations based on the tenant id - for example, fetching tenant-specific data from other services.&lt;/p&gt;
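&lt;p&gt;A tenant-aware handler could read it like this (a sketch: the &lt;code&gt;tenant_id&lt;/code&gt; attribute name follows the example above, and the query-string fallback is my own addition for local testing):&lt;/p&gt;

```python
# Sketch of a tenant-aware handler reading the tenant id from the context
# object. getattr is used so the sketch also runs when the attribute is
# absent, e.g. in local tests or when tenant isolation is not enabled.

def handler(event, context):
    tenant_id = getattr(context, "tenant_id", None)
    if tenant_id is None:
        # fallback: read it from the query string instead
        tenant_id = (event.get("queryStringParameters") or {}).get("tenant_id")
    return {"tenant": tenant_id}
```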

&lt;p&gt;In this example, I have used the tenant id from the context object to publish a custom CloudWatch metric per tenant. This is helpful for monitoring per-tenant invocation patterns, which can be useful for billing or auditing purposes.&lt;/p&gt;
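&lt;p&gt;Such a per-tenant metric can be published with boto3's &lt;code&gt;put_metric_data&lt;/code&gt;. The namespace &lt;code&gt;MyApp/Tenants&lt;/code&gt; and metric name &lt;code&gt;TenantInvocations&lt;/code&gt; below are my own choices, not anything the feature mandates, and the function needs &lt;code&gt;cloudwatch:PutMetricData&lt;/code&gt; permission:&lt;/p&gt;

```python
# Publish a per-tenant invocation metric to CloudWatch. The boto3 import is
# guarded so the payload-building part of this sketch runs even where boto3
# is not installed.

try:
    import boto3
except ImportError:
    boto3 = None

def build_tenant_metric(tenant_id, value=1):
    # One datapoint, dimensioned by tenant id
    return {
        "MetricName": "TenantInvocations",
        "Dimensions": [{"Name": "TenantId", "Value": tenant_id}],
        "Value": value,
        "Unit": "Count",
    }

def publish_tenant_metric(tenant_id):
    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(
        Namespace="MyApp/Tenants",
        MetricData=[build_tenant_metric(tenant_id)],
    )
```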

&lt;h2&gt;
  
  
  Effect on Lambda concurrency and cold starts
&lt;/h2&gt;

&lt;p&gt;One important thing to understand is how tenant isolation affects Lambda concurrency. Since execution environments are not shared across tenants, each tenant will need their own set of execution environments. This means that the overall number of concurrent execution environments can be higher compared to a non-isolated Lambda function where environments are freely shared.&lt;/p&gt;

&lt;p&gt;For example, if you have 10 tenants each making concurrent requests, instead of reusing a pool of warm execution environments, Lambda needs to maintain separate pools per tenant. This can lead to &lt;strong&gt;more cold starts&lt;/strong&gt;, especially for low-traffic tenants.&lt;/p&gt;

&lt;p&gt;However, this is a trade-off worth making when tenant isolation is critical for your application.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Please note:&lt;/strong&gt; Make sure to consider the Lambda concurrency limits in your account when enabling tenant isolation, especially when dealing with a large number of tenants.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Integration with API Gateway
&lt;/h2&gt;

&lt;p&gt;As of now, the only integration that supports the Lambda tenant isolation feature is API Gateway.&lt;/p&gt;

&lt;p&gt;In the example project, I have mapped the incoming query string &lt;code&gt;tenant_id&lt;/code&gt; to the integration request header &lt;code&gt;X-Amz-Tenant-Id&lt;/code&gt; for the Lambda service.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;tenant_aware_integration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;apigw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;LambdaIntegration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;tenant_aware_lambda&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;request_parameters&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;integration.request.header.X-Amz-Tenant-Id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;method.request.querystring.tenant_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Lambda integration on API Gateway gives us a lot of flexibility in choosing which value to map as the &lt;code&gt;X-Amz-Tenant-Id&lt;/code&gt;. It could be the source AWS account id, a value from the request body, or a header value. If authentication and authorization are enabled, it could even be a Cognito user group or a claim from a JWT. This makes API Gateway a convenient place to resolve and pass the tenant id to Lambda.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd like to see next
&lt;/h2&gt;

&lt;p&gt;While Lambda tenant isolation is a solid step forward, there are a few areas where I think it can be even more valuable.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;More integration support&lt;/strong&gt;: Currently, API Gateway is the only Lambda integration that supports this feature. It would be great to see it extended to other integrations, such as SQS to Lambda via event source mapping.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native per-tenant metrics&lt;/strong&gt;: Built-in CloudWatch metrics by tenant id would remove the need to publish custom metrics manually.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-tenant concurrency controls&lt;/strong&gt;: The ability to set concurrency limits per tenant would help prevent a noisy tenant from consuming all the available concurrency.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Tenant isolation is a fundamental requirement in many SaaS applications. While the single-tenant approach provides strong isolation, it comes with significant operational overhead. The multi-tenant approach is operationally simpler but introduces security risks with shared execution environments.&lt;/p&gt;

&lt;p&gt;Lambda tenant isolation gives us the best of both worlds. We get a single Lambda function that is easy to manage and deploy, while Lambda service ensures that the execution environments are isolated per tenant. This eliminates the risk of data leakage across tenants without the burden of managing separate Lambda functions for each tenant.&lt;/p&gt;

&lt;p&gt;This is a great addition to the serverless toolbox. If you are building multi-tenant SaaS applications on AWS Lambda, this feature is worth exploring.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;AWS Lambda tenant isolation documentation: &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/tenant-isolation.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/lambda/latest/dg/tenant-isolation.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Launch blog post: &lt;a href="https://aws.amazon.com/blogs/aws/streamlined-multi-tenant-application-development-with-tenant-isolation-mode-in-aws-lambda" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/aws/streamlined-multi-tenant-application-development-with-tenant-isolation-mode-in-aws-lambda&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;👋 I regularly create content on &lt;strong&gt;AWS&lt;/strong&gt; and &lt;strong&gt;Serverless&lt;/strong&gt;, and if you're interested, feel free to follow/connect with me so you don't miss out on my latest posts!&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/pubudusj" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/pubudusj&lt;/a&gt;&lt;br&gt;
Twitter/X: &lt;a href="https://x.com/pubudusj" rel="noopener noreferrer"&gt;https://x.com/pubudusj&lt;/a&gt;&lt;br&gt;
Medium: &lt;a href="https://medium.com/@pubudusj" rel="noopener noreferrer"&gt;https://medium.com/@pubudusj&lt;/a&gt;&lt;br&gt;
Personal blog: &lt;a href="https://pubudu.dev" rel="noopener noreferrer"&gt;https://pubudu.dev&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>lambda</category>
      <category>serverless</category>
      <category>saas</category>
    </item>
    <item>
      <title>Simple Leave Management with AWS Lambda Durable Functions</title>
      <dc:creator>Pubudu Jayawardana</dc:creator>
      <pubDate>Wed, 14 Jan 2026 00:17:16 +0000</pubDate>
      <link>https://forem.com/aws-builders/simple-leave-management-with-aws-lambda-durable-functions-if2</link>
      <guid>https://forem.com/aws-builders/simple-leave-management-with-aws-lambda-durable-functions-if2</guid>
      <description>&lt;h2&gt;
  
  
  Intro
&lt;/h2&gt;

&lt;p&gt;At AWS re:Invent 2025, Lambda introduced Durable Functions with a great set of features. One of the main features is that a single execution can span up to one year. Also, with built-in checkpointing, it is possible to track the steps that have already been completed: when an execution is retried after a resume or an interruption, completed steps are skipped and the execution resumes from the next step.&lt;/p&gt;

&lt;p&gt;When it comes to orchestrating multiple AWS services into a workflow, up until now AWS Step Functions was the best choice. However, Durable Functions now offer similar functionality for building a workflow within the same familiar Lambda environment, which is great!&lt;/p&gt;

&lt;p&gt;In this blog post, I explain how the Durable Functions callback feature can be used for human-in-the-loop functionality.&lt;/p&gt;

&lt;h2&gt;
  
  
  Durable Function callbacks
&lt;/h2&gt;

&lt;p&gt;When an execution starts, the durable function starts an invocation. When there are multiple steps within the execution, it is possible to put the execution on hold (or wait) at a certain step, either for a given time period or until a signal is received from an external process. In the case of an external signal, once the Lambda service receives it, the execution that was on hold is either resumed or terminated, depending on whether the signal indicates success or failure.&lt;/p&gt;

&lt;p&gt;This is a great feature supported natively by Durable Functions. It is helpful, for example, in a business process where human approval is required to continue the flow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Simple leave management with Durable Functions
&lt;/h2&gt;

&lt;p&gt;In this example, an employee can send a leave request and the manager can approve or reject it. Below are the steps involved and how Durable Functions are used.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Employee sends a leave request. The durable function starts an execution.&lt;/li&gt;
&lt;li&gt;A leave record is created in a DynamoDB table with a pending state.&lt;/li&gt;
&lt;li&gt;The employee receives an email confirming receipt of the request.&lt;/li&gt;
&lt;li&gt;The manager receives an email with a callback id to be used to approve or reject the leave.&lt;/li&gt;
&lt;li&gt;The durable function keeps the execution on hold until the manager's approval or rejection is received.&lt;/li&gt;
&lt;li&gt;Once the manager approves or rejects, the execution resumes.&lt;/li&gt;
&lt;li&gt;If the manager doesn’t process the request within a given time period, the request expires.&lt;/li&gt;
&lt;li&gt;The leave status is updated in the DynamoDB table.&lt;/li&gt;
&lt;li&gt;The employee receives an email with the manager’s decision or the expiry.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftnwlydjs7sytrj8n6f13.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftnwlydjs7sytrj8n6f13.png" alt="Image: Architecture" width="800" height="386"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;There is a proxy function that accepts and validates the incoming leave-creation request. Here, I used a Lambda Function URL to submit the request.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Once validated, the proxy function will invoke the durable function.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Why use the proxy function?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Even with Durable Functions, synchronous invocations are limited to 15 minutes. Since I need the durable function to run longer, it has to be triggered asynchronously. Via the proxy function, the durable function is invoked with invocation type ‘Event’, which is fire-and-forget and does not wait for the durable function to complete.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When the durable function starts its execution, first a leave record is created in the DynamoDB table.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Then, an email is sent to the employee confirming the receipt of the leave request.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In the next step, an email is sent to the manager asking to approve or reject the leave request. This step is a callback step. The execution waits at this step until the manager approves or rejects.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Since this is a callback step, a callback id is generated. This callback id is included in the email to the manager.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;To simulate the manager’s approval or rejection, there is a Function URL exposed to trigger the &lt;em&gt;Process Leave Lambda&lt;/em&gt; function.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;This function accepts &lt;em&gt;callback_id&lt;/em&gt; and the &lt;em&gt;decision&lt;/em&gt; (approve or reject) as input.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Based on the decision, this function will call the Lambda boto3 methods &lt;em&gt;send_durable_execution_callback_success&lt;/em&gt; or &lt;em&gt;send_durable_execution_callback_failure&lt;/em&gt; with the given callback id.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Please note:&lt;/strong&gt; These latest SDK methods are not yet available in the boto3 version provided by Lambda by default. So, you are required to package boto3 version 1.42.1 or newer with your Lambda source code.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When the Lambda service receives the decision, the durable execution that was on hold will resume.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If the manager’s decision is not received in time (I set this to 5 minutes at the moment), the wait_for_callback step will throw a &lt;em&gt;CallableRuntimeError&lt;/em&gt; exception with the error message &lt;em&gt;Callback timed out&lt;/em&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Based on the decision or the expiry of the callback, the leave record in the DB is updated.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Then an email will be sent to the employee with the leave's final status and the execution is completed.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
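&lt;p&gt;To make the proxy and callback wiring above concrete, here is a minimal sketch. The &lt;code&gt;invoke&lt;/code&gt; parameters are standard boto3; the keyword argument names for the two callback methods are assumptions, so verify them against the boto3 (1.42.1+) documentation.&lt;/p&gt;

```python
import json

def async_invoke_params(function_name: str, leave_request: dict) -> dict:
    """Kwargs for the proxy's fire-and-forget invocation of the durable function."""
    return {
        "FunctionName": function_name,
        "InvocationType": "Event",  # fire-and-forget: bypasses the 15-minute sync limit
        "Payload": json.dumps(leave_request),
    }

def callback_call(callback_id: str, decision: str) -> tuple:
    """Pick the Lambda API call that signals the waiting execution.

    Method names are from the post; the kwarg names below are assumptions.
    """
    if decision == "approve":
        return ("send_durable_execution_callback_success",
                {"CallbackId": callback_id, "Result": json.dumps({"status": "approved"})})
    return ("send_durable_execution_callback_failure",
            {"CallbackId": callback_id, "Error": "LeaveRejected"})

# Usage (requires boto3 >= 1.42.1 packaged with the function):
#   client = boto3.client("lambda")
#   client.invoke(**async_invoke_params("leave-durable-fn", {"start_date": "2026-01-10"}))
#   method, kwargs = callback_call("abc-123", "approve")
#   getattr(client, method)(**kwargs)
```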

&lt;h2&gt;
  
  
  Try this yourself
&lt;/h2&gt;

&lt;p&gt;Here is a GitHub repository of a sample project I created with AWS CDK and Python so you can try out this scenario in your AWS account.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/pubudusj/simple-leave-management-with-durable-functions" rel="noopener noreferrer"&gt;https://github.com/pubudusj/simple-leave-management-with-durable-functions&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Clone the repository.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Copy .env.example to .env and add the values. The value for &lt;em&gt;SYSTEM_FROM_EMAIL&lt;/em&gt; must already be verified in Simple Email Service (SES) so it can send emails to the manager and employee email addresses.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Deploy the stack with CDK.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Once deployed, there will be two Lambda function URLs in the output: one for creating the leave request and the other for processing the leave request (as the manager).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Send a POST request to the create leaves endpoint with a payload similar to below:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"start_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"2026-01-10"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"end_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-01-20"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"employee_email"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"employee@email.com"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;This will send an email to the employee email address as follows:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2syxb7rcmlvsereo3qbk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2syxb7rcmlvsereo3qbk.png" alt="Image: Leave submitted notification for employee" width="800" height="281"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Also, an email to the manager with the callback id as follows:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvlz554e4smlp10qw26qt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvlz554e4smlp10qw26qt.png" alt="Image: Leave approval request for manager" width="800" height="281"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The manager can send a POST request to the leave process endpoint with the callback id from the email and their decision (approve or reject) as follows:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"callback_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"[callback id from email]"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"decision"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"approve"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Based on the decision, the employee will receive an email with the status.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdjyq7cb30shx23x62244.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdjyq7cb30shx23x62244.png" alt="Image: Leave decision notification for employee" width="800" height="281"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If the manager didn't process the decision within the given time (for demo purposes, set to 5 minutes), the leave will be marked as expired and the employee will receive an email as follows:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F66y9u4x1xi4zdpvzwdw2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F66y9u4x1xi4zdpvzwdw2.png" alt="Image: Leave expired notification for employee" width="800" height="281"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Please try this and let me know your thoughts!&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;AWS Lambda Durable Functions documentation: &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/durable-functions.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/lambda/latest/dg/durable-functions.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Durable execution SDK: &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/durable-execution-sdk.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/lambda/latest/dg/durable-execution-sdk.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>lambda</category>
      <category>durablefunctions</category>
    </item>
    <item>
      <title>Monitoring multiple dynamic resources using a single Amazon CloudWatch alarm</title>
      <dc:creator>Pubudu Jayawardana</dc:creator>
      <pubDate>Tue, 21 Oct 2025 09:18:26 +0000</pubDate>
      <link>https://forem.com/aws-builders/monitoring-multiple-dynamic-resources-using-a-single-amazon-cloudwatch-alarm-3lbi</link>
      <guid>https://forem.com/aws-builders/monitoring-multiple-dynamic-resources-using-a-single-amazon-cloudwatch-alarm-3lbi</guid>
      <description>&lt;h2&gt;
  
  
  Intro
&lt;/h2&gt;

&lt;p&gt;When you need to monitor your resources with a CloudWatch alarm, what you normally have to do is create an alarm with a specific metric of that resource. Although this gives granular monitoring of your resources, you always have to add or remove alarms whenever resources are created or removed. This is an operational overhead, even though it can be automated with an infrastructure-as-code tool.&lt;/p&gt;

&lt;p&gt;Another option is to use aggregated metrics in your alarms, such as CPUUtilization for EC2, which gives you high-level coverage across a group of resources. The downside is that it lacks granular visibility into your individual resources. Also, only a limited number of resources support aggregated metrics.&lt;/p&gt;

&lt;p&gt;In September 2025, Amazon CloudWatch introduced a nice feature that allows monitoring multiple individual metrics via a single alarm using CloudWatch Metrics Insights. By using a Metrics Insights SQL query in the alarm, the alarm automatically updates its query results with each evaluation and adjusts in real time as resources are created and deleted.&lt;/p&gt;

&lt;p&gt;With the introduction of multi-metric alarms, you now get both granular per-resource monitoring and lower maintenance: you don’t need to update alarms when resources are created or removed from your application, because the alarm itself automatically discovers and monitors the resources.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;You can define a generic SQL statement as the alarm metric.&lt;/p&gt;

&lt;p&gt;For example, let’s assume you would like to monitor all your SQS queues for available messages and be notified if any messages are available.&lt;/p&gt;

&lt;p&gt;For this requirement, you can define a query as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;MAX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ApproximateNumberOfMessagesVisible&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;SCHEMA&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;"AWS/SQS"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;QueueName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;QueueName&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this query, the alarm monitors the ApproximateNumberOfMessagesVisible count of all your queues and triggers if a specific queue has at least one visible message. Also, if you add a new queue after the alarm is created, it will be taken into account when evaluating the query.&lt;/p&gt;
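&lt;p&gt;As a sketch of creating such an alarm programmatically, the parameters below embed the Metrics Insights query as a metric math expression in a PutMetricAlarm call. The alarm name and threshold are illustrative; check the alarm creation documentation for any extra settings the per-contributor behaviour requires.&lt;/p&gt;

```python
QUERY = (
    'SELECT MAX(ApproximateNumberOfMessagesVisible) '
    'FROM SCHEMA("AWS/SQS", QueueName) '
    'GROUP BY QueueName '
    'ORDER BY COUNT() DESC'
)

# Illustrative parameters for CloudWatch PutMetricAlarm
alarm_params = {
    "AlarmName": "sqs-visible-messages-any-queue",  # hypothetical name
    "Metrics": [{
        "Id": "q1",
        "Expression": QUERY,  # the Metrics Insights query, re-evaluated on every cycle
        "Period": 60,
        "ReturnData": True,
    }],
    "EvaluationPeriods": 1,
    "Threshold": 1,
    "ComparisonOperator": "GreaterThanOrEqualToThreshold",
}
# boto3.client("cloudwatch").put_metric_alarm(**alarm_params)
```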

&lt;p&gt;Also, you can use tags to filter the resources you need to monitor. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;MAX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ApproximateNumberOfMessagesVisible&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;SCHEMA&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;"AWS/SQS"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;QueueName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Component&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'consumer'&lt;/span&gt;
&lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Environment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'production'&lt;/span&gt;
&lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsDLQ&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'true'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;QueueName&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;It would have been great if this supported wildcard or regex-based filtering on resource properties, but as of now the only ways to filter are by tags or by comparing the complete property value (equal or not equal).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;When you create the alarm with this Metrics Insights Query, you can see a new section on the alarm called “Contributors”. Here, you can see all the resources that match the conditions in the alarm query as well as the current state of each contributor.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5kkp1y03atyifts7qojt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5kkp1y03atyifts7qojt.png" alt="Image: CloudWatch multi metric alarm contributors." width="800" height="427"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Image: CloudWatch multi metric alarm contributors.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Alarm states
&lt;/h2&gt;

&lt;p&gt;When a single contributor metric breaches the threshold, its state changes to ‘In alarm’. The alarm details clearly show which contributor caused the state change, which is super helpful for identifying the exact resource that triggered the alarm.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmqcyajbocr6agkofbmvf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmqcyajbocr6agkofbmvf.png" alt="Image: CloudWatch multi metric alarm notification.&amp;lt;br&amp;gt;
" width="635" height="587"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Image: CloudWatch multi metric alarm notification.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The alarm action is based on contributor transitions. This means that even when the alarm is already in the ‘In alarm’ state, if another contributor breaches the threshold and its state becomes ‘In alarm’, the alarm action is triggered again.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fflazxnnnyelqa26s0qd1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fflazxnnnyelqa26s0qd1.png" alt="Image: CloudWatch multi metric example alarm history.&amp;lt;br&amp;gt;
" width="800" height="288"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Image: CloudWatch multi metric example alarm history.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This is how a similar situation appears in the CloudWatch console.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl9y6ok180ace23wbzpmi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl9y6ok180ace23wbzpmi.png" alt="Image: CloudWatch multi metric example alarm graph.&amp;lt;br&amp;gt;
" width="800" height="598"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Image: CloudWatch multi metric example alarm graph.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Try this yourself
&lt;/h2&gt;

&lt;p&gt;I have created a GitHub repository with a SAM template that deploys some AWS resources into your AWS account so you can try this scenario.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pre-requisites
&lt;/h3&gt;

&lt;p&gt;You need to enable the CloudWatch setting “Resource tags for telemetry” if you want to use tag-based filters in these queries. Go to CloudWatch &amp;gt; Settings &amp;gt; Enable resource tags on telemetry in your region to enable this.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Clone the repository: &lt;a href="https://github.com/pubudusj/cloudwatch-multi-metrics-alarm" rel="noopener noreferrer"&gt;https://github.com/pubudusj/cloudwatch-multi-metrics-alarm&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Deploy the stack using AWS SAM CLI.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Once the stack is deployed, you can see the contributors in the created CloudWatch alarm.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When you add some messages to one of the SQS queues, the alarm state changes and you receive the alarm notification, which includes the contributor.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can uncomment lines 64-74 and deploy again to create a new resource.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;This new resource will be evaluated, and you can see it listed in the contributors list.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If you add a message to the new queue, the alarm triggers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Likewise, any resource, new or old, that matches the query is considered when evaluating the alarm.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Limitations
&lt;/h2&gt;

&lt;p&gt;There are some notable limitations in this approach.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Currently a single query can return no more than 500 time series, and this is a hard limit. This means that with the above query, a maximum of 500 resources can be monitored, which might not be sufficient for an application with a lot of resources. When you have more than 500 resources to monitor, you will have to use multiple alarms.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;There is no wildcard or regex support in the query (unlike some other CloudWatch features).&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;This new CloudWatch feature is great for monitoring dynamic resources in your application without the management overhead of adding and removing alarms as resources are created and deleted. There are some hard limits, so it might not suit every situation. However, I believe more improvements will be added to this feature in the near future, such as wildcard or regex-based resource filtering, which will give a better developer experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;p&gt;Release news: Amazon CloudWatch query alarms now support monitoring metrics individually. &lt;a href="https://aws.amazon.com/about-aws/whats-new/2025/09/amazon-cloudwatch-alarm-multiple-metrics/" rel="noopener noreferrer"&gt;https://aws.amazon.com/about-aws/whats-new/2025/09/amazon-cloudwatch-alarm-multiple-metrics/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Creating a Metrics Insights CloudWatch alarm. &lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch-metrics-insights-alarm-create.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch-metrics-insights-alarm-create.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Metrics Insights quotas &lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch-metrics-insights-limits.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch-metrics-insights-limits.html&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;👋 I regularly create content on &lt;strong&gt;AWS&lt;/strong&gt; and &lt;strong&gt;Serverless&lt;/strong&gt;, and if you're interested, feel free to follow/connect with me so you don't miss out on my latest posts!&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/pubudusj" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/pubudusj&lt;/a&gt;&lt;br&gt;
Twitter/X: &lt;a href="https://x.com/pubudusj" rel="noopener noreferrer"&gt;https://x.com/pubudusj&lt;/a&gt;&lt;br&gt;
Medium: &lt;a href="https://medium.com/@pubudusj" rel="noopener noreferrer"&gt;https://medium.com/@pubudusj&lt;/a&gt;&lt;br&gt;
Personal blog: &lt;a href="https://pubudu.dev" rel="noopener noreferrer"&gt;https://pubudu.dev&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>cloudwatch</category>
      <category>monitoring</category>
    </item>
    <item>
      <title>Detect EventBridge target failure: Part 2 - using enhanced monitoring</title>
      <dc:creator>Pubudu Jayawardana</dc:creator>
      <pubDate>Tue, 30 Sep 2025 10:06:27 +0000</pubDate>
      <link>https://forem.com/aws-builders/detect-eventbridge-target-failure-part-2-using-enhanced-monitoring-46fd</link>
      <guid>https://forem.com/aws-builders/detect-eventbridge-target-failure-part-2-using-enhanced-monitoring-46fd</guid>
      <description>&lt;h2&gt;
  
  
  Intro
&lt;/h2&gt;

&lt;p&gt;When delivering messages to different targets using EventBridge, it is important to get notified if there are any delivery failures. EventBridge doesn’t provide this out of the box, but there are several ways to achieve this.&lt;/p&gt;

&lt;p&gt;In the &lt;a href="https://pubudu.dev/posts/detect-eventbridge-target-failure-part-1/" rel="noopener noreferrer"&gt;1st part of this blog&lt;/a&gt; we discussed how we can get notified when a delivery fails to a target using a dead letter queue.&lt;/p&gt;

&lt;p&gt;In this blog post we will discuss another (better?) option to achieve the same using &lt;strong&gt;EventBridge enhanced logging&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  EventBridge Enhanced Logging
&lt;/h2&gt;

&lt;p&gt;On 15th July 2025, &lt;a href="https://aws.amazon.com/about-aws/whats-new/2025/07/amazon-eventbridge-enhanced-logging-improved-observability/" rel="noopener noreferrer"&gt;AWS introduced enhanced logging for EventBridge&lt;/a&gt;. This means you can now enable logging, and EventBridge will send the logs to a configured log delivery location.&lt;/p&gt;

&lt;p&gt;Available log destinations are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;S3&lt;/li&gt;
&lt;li&gt;CloudWatch logs&lt;/li&gt;
&lt;li&gt;Amazon Data Firehose stream&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Also, you can configure the standard log levels as required: Trace, Info and Error.&lt;/p&gt;

&lt;p&gt;You can select more than one log destination, and for each destination you can select the same or a different log level. This is really useful: for example, I can use S3 to log all traces while using CloudWatch Logs to log only errors.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;In this example, I have used EventBridge enhanced logging to log any errors to a CloudWatch log stream.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnxbqm3hs7k1nl935awhe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnxbqm3hs7k1nl935awhe.png" alt="Architecture" width="800" height="406"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Image: Architecture&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Simply put, EventBridge logs any errors to CloudWatch Logs. Once a log record is available in the CloudWatch log stream, there are multiple ways to trigger a CloudWatch alarm. In this example, I use the number of incoming log events as the metric to trigger the alarm.&lt;/p&gt;

&lt;p&gt;If you use a CloudWatch log stream with log level trace (which includes info and errors as well) and you still want to trigger an alarm only when an error occurs, you can create a metric filter on the log group.&lt;/p&gt;
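&lt;p&gt;As a sketch, a metric filter like the one below could turn error log entries into a custom metric to alarm on. The log group name and the JSON field names in the pattern are assumptions about the enhanced log format; adjust them to the actual log entries you see.&lt;/p&gt;

```python
# Illustrative parameters for CloudWatch Logs PutMetricFilter
metric_filter_params = {
    "logGroupName": "/aws/vendedlogs/events/my-bus",  # hypothetical log group
    "filterName": "eventbridge-invocation-failures",
    # Assumed field name; match it to the actual enhanced log entries
    "filterPattern": '{ $.logLevel = "ERROR" }',
    "metricTransformations": [{
        "metricName": "EventBridgeDeliveryErrors",
        "metricNamespace": "Custom/EventBridge",  # hypothetical namespace
        "metricValue": "1",   # count one per matching log event
        "defaultValue": 0,
    }],
}
# boto3.client("logs").put_metric_filter(**metric_filter_params)
```

An alarm on the &lt;code&gt;EventBridgeDeliveryErrors&lt;/code&gt; metric then fires only on error entries, even when the destination receives trace-level logs.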
&lt;h2&gt;
  
  
  Try this yourself
&lt;/h2&gt;

&lt;p&gt;I have created a GitHub repository with an AWS SAM template for you to test this scenario in your AWS account.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Clone the GitHub repository: &lt;a href="https://github.com/pubudusj/event-bridge-target-failure-detection-with-enhanced-logging" rel="noopener noreferrer"&gt;https://github.com/pubudusj/event-bridge-target-failure-detection-with-enhanced-logging&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Deploy the stack using the command below:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sam deploy \
--template-file template.yaml \
--stack-name eb-fail-detection-with-enhanced-logging \
--capabilities CAPABILITY_IAM \
--no-confirm-changeset \
--parameter-overrides NotificationEmail=[YourEmailAddress]
&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Here, pass your email address as &lt;code&gt;NotificationEmail&lt;/code&gt; so you will receive a notification by email when the target fails.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Once the stack is deployed, you will get an SNS subscription confirmation email. You need to confirm it in order to receive notifications.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Then, publish a message to the created event bus with the source set to &lt;code&gt;xyzcorp&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;This message matches the rule, and EventBridge attempts to deliver it to the target.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;I have intentionally removed the permission to publish to the target in order to simulate a failure.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;In a moment, you should get an email with the alarm status.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;If you go to the created CloudWatch log stream, you can see the entry with log level &lt;code&gt;ERROR&lt;/code&gt; and message type &lt;code&gt;INVOCATION_FAILURE&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;/ol&gt;
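&lt;p&gt;Step 5 above (publishing the test event) can be done from the console or programmatically. As a sketch with boto3, where the event bus name and the detail payload are placeholders:&lt;/p&gt;

```python
import json

# Sketch: build a PutEvents entry for the test event with source
# "xyzcorp". The event bus name and detail payload are placeholders;
# only the source must match the rule from this example.
def build_test_event(event_bus_name: str) -> dict:
    return {
        "Source": "xyzcorp",
        "DetailType": "test-event",
        "Detail": json.dumps({"message": "trigger target failure"}),
        "EventBusName": event_bus_name,
    }

# Sending it would look like:
#   import boto3
#   boto3.client("events").put_events(Entries=[build_test_event("my-demo-bus")])
```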

&lt;p&gt;&lt;strong&gt;Please note:&lt;/strong&gt;&lt;br&gt;
As of now, setting up enhanced logging and sending the logs to a delivery destination is not a straightforward configuration. You need to create a CloudWatch log group, a delivery source, a delivery destination and a logs delivery.&lt;/p&gt;

&lt;p&gt;Refer: &lt;a href="https://github.com/pubudusj/event-bridge-target-failure-detection-with-enhanced-logging/blob/main/template.yaml#L31-L59" rel="noopener noreferrer"&gt;https://github.com/pubudusj/event-bridge-target-failure-detection-with-enhanced-logging/blob/main/template.yaml#L31-L59&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This means that if you need to send these enhanced logs to multiple destinations, you have to repeat this set of resources for each destination.&lt;/p&gt;

&lt;p&gt;However, the nice thing about this approach is that you only need to configure it once on the event bus, and it will log the whole message life cycle within EventBridge, including ingestion as well as delivery to all the targets.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Overall, enhanced logging is a great improvement to EventBridge because, until now, message delivery was a black box for customers, especially on the consumer side. With this addition, you can transparently track and debug the flow of a message within EventBridge using the logs generated at every step the message goes through, from ingestion to delivery (depending, of course, on the configured log level).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can also configure more than one log destination, as well as different log levels.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Since creating a single log delivery with CloudFormation requires several AWS resources, I hope the EventBridge team will provide an easier way to configure this, ideally as properties of the EventBridge bus.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Monitor and debug event-driven applications with new Amazon EventBridge logging: &lt;a href="https://aws.amazon.com/blogs/aws/monitor-and-debug-event-driven-applications-with-new-amazon-eventbridge-logging/" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/aws/monitor-and-debug-event-driven-applications-with-new-amazon-eventbridge-logging/&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;👋 I regularly create content on &lt;strong&gt;AWS&lt;/strong&gt; and &lt;strong&gt;Serverless&lt;/strong&gt;, and if you're interested, feel free to follow/connect with me so you don't miss out on my latest posts!&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/pubudusj" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/pubudusj&lt;/a&gt;&lt;br&gt;
Twitter/X: &lt;a href="https://x.com/pubudusj" rel="noopener noreferrer"&gt;https://x.com/pubudusj&lt;/a&gt;&lt;br&gt;
Medium: &lt;a href="https://medium.com/@pubudusj" rel="noopener noreferrer"&gt;https://medium.com/@pubudusj&lt;/a&gt;&lt;br&gt;
Personal blog: &lt;a href="https://pubudu.dev" rel="noopener noreferrer"&gt;https://pubudu.dev&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>eventbridge</category>
      <category>eventdriven</category>
    </item>
    <item>
      <title>Detect EventBridge target failure: Part 1 - with dead letter queue</title>
      <dc:creator>Pubudu Jayawardana</dc:creator>
      <pubDate>Tue, 23 Sep 2025 20:20:31 +0000</pubDate>
      <link>https://forem.com/pubudusj/detect-eventbridge-target-failure-part-1-with-dead-letter-queue-4o73</link>
      <guid>https://forem.com/pubudusj/detect-eventbridge-target-failure-part-1-with-dead-letter-queue-4o73</guid>
      <description>&lt;h2&gt;
  
  
  Intro
&lt;/h2&gt;

&lt;p&gt;When EventBridge delivers messages to a target, delivery can fail for many reasons: permission issues, rate limits, unavailability of the target, or even a glitch within AWS itself, to name a few.&lt;/p&gt;

&lt;p&gt;No matter the reason, it is always ideal to get notified that messages are failing to deliver, along with the reason for the failure. In this blog post I discuss how a dead letter queue can be used to get notified when EventBridge fails to deliver messages to a target.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dead letter queues
&lt;/h2&gt;

&lt;p&gt;Dead letter queues are the unsung heroes of event-driven architecture 😀. They are easy to set up and manage, greatly improve the resilience of a system, and are very cost effective.&lt;/p&gt;

&lt;p&gt;Let’s see how we can capture the target delivery failures in EventBridge using a dead letter queue.&lt;/p&gt;

&lt;p&gt;Please note that EventBridge supports DLQs at a couple of “levels”: the event bus itself can have a DLQ, or you can set a DLQ per target. Let’s discuss the differences.&lt;/p&gt;

&lt;h2&gt;
  
  
  DLQ on EventBridge bus level
&lt;/h2&gt;

&lt;p&gt;An EventBridge bus can have a DLQ of its own. However, this is limited to capturing errors related to KMS encryption: EventBridge sends events that aren’t successfully encrypted to this DLQ.&lt;/p&gt;

&lt;p&gt;You can see this DLQ setting for the event bus in the AWS console only when a customer managed KMS key is used to encrypt messages. In fact, it is part of the encryption settings.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy124o0m6p3zkjlwtkxq4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy124o0m6p3zkjlwtkxq4.png" alt="Image: DLQ for Event bus only available when customer managed KMS is in use." width="800" height="353"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Image: DLQ for Event bus only available when customer managed KMS is in use.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;However, &lt;strong&gt;this DLQ will NOT capture any target-related failures&lt;/strong&gt;, so we cannot use it for our purpose.&lt;/p&gt;

&lt;h2&gt;
  
  
  DLQ on EventBridge target level
&lt;/h2&gt;

&lt;p&gt;At the target level, we can set up an SQS queue where EventBridge puts any message it cannot deliver to that target.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0k4oi090bk31fmj8j19a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0k4oi090bk31fmj8j19a.png" alt="DLQ on target" width="800" height="301"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Image: DLQ on target.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Since one rule can have more than one target, each target can have a different DLQ as well. You can use the same SQS queue as the DLQ for all the targets, but you have to configure it for each and every target separately. That may sound like repetitive work, but if you use an infrastructure as code tool like CDK or CloudFormation, it is not complex.&lt;/p&gt;
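&lt;p&gt;As a sketch, attaching a DLQ to a target via the EventBridge &lt;code&gt;PutTargets&lt;/code&gt; API looks like the following. The rule, bus, queue, and target names are placeholders, not values from any specific stack.&lt;/p&gt;

```python
# Sketch: an EventBridge target definition with a DeadLetterConfig.
# ARNs and IDs are placeholders chosen for illustration.
def target_with_dlq(target_arn: str, dlq_arn: str) -> dict:
    return {
        "Id": "sqs-target-with-dlq",
        "Arn": target_arn,
        # Messages EventBridge cannot deliver to this target are sent
        # to the SQS queue referenced here.
        "DeadLetterConfig": {"Arn": dlq_arn},
    }

# Applying it would look like:
#   import boto3
#   boto3.client("events").put_targets(
#       Rule="my-rule",
#       EventBusName="my-bus",
#       Targets=[target_with_dlq(queue_arn, dlq_arn)],
#   )
```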

&lt;h3&gt;
  
  
  How it works
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwblxhq8g9k3b6ljowecu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwblxhq8g9k3b6ljowecu.png" alt="High level architecture" width="800" height="311"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Image: High level architecture.&lt;/em&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;The event bus tries to deliver a message to its target (here, an SQS queue) via an EventBridge rule.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Let’s assume there is a permission issue, and the message cannot be delivered.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Then, EventBridge will put the message into the DLQ configured for this specific target.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In CloudWatch, there is an alarm set up to be triggered whenever there is a message in the DLQ.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When the failed message lands in the DLQ, the alarm triggers; an SNS topic is configured as the alarm action.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When the alarm action publishes a message to the SNS topic, a notification about the failure is sent to all subscribers.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
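&lt;p&gt;Steps 4 and 5 can be sketched as a CloudWatch &lt;code&gt;PutMetricAlarm&lt;/code&gt; call on the DLQ depth. The queue name, threshold values, and SNS topic ARN below are placeholders; an actual template may use different values.&lt;/p&gt;

```python
# Sketch: CloudWatch alarm parameters that fire when the DLQ contains
# at least one visible message. All names and ARNs are placeholders.
def dlq_alarm_params(queue_name: str, sns_topic_arn: str) -> dict:
    return {
        "AlarmName": f"{queue_name}-has-messages",
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateNumberOfMessagesVisible",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "Statistic": "Maximum",
        "Period": 60,             # evaluate over 1-minute windows
        "EvaluationPeriods": 1,
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],  # step 5: notify via SNS
    }

# Applying it would look like:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(
#       **dlq_alarm_params("eb-target-dlq", topic_arn))
```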
&lt;h3&gt;
  
  
  Try this yourself
&lt;/h3&gt;

&lt;p&gt;I have created an AWS SAM template to try this scenario in your AWS account.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Clone the GitHub repository: &lt;a href="https://github.com/pubudusj/event-bridge-target-failure-detection-with-dlq" rel="noopener noreferrer"&gt;https://github.com/pubudusj/event-bridge-target-failure-detection-with-dlq&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Deploy the stack using the command below:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight shell"&gt;&lt;code&gt;sam deploy &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--template-file&lt;/span&gt; template.yaml &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--stack-name&lt;/span&gt; event-bridge-target-failure-detection-with-dlq &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--capabilities&lt;/span&gt; CAPABILITY_IAM &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--no-confirm-changeset&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--parameter-overrides&lt;/span&gt; &lt;span class="nv"&gt;NotificationEmail&lt;/span&gt;&lt;span class="o"&gt;=[&lt;/span&gt;YourEmailAddress]
&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Here, pass your email address as &lt;code&gt;NotificationEmail&lt;/code&gt; so you will receive a notification by email when the target fails.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Once the stack is deployed, you will get an SNS subscription confirmation email. You need to confirm it in order to receive notifications.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Then, publish a message to the created event bus with the source set to &lt;code&gt;xyzcorp&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;This message matches the rule, and EventBridge attempts to deliver it to the target.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;I have intentionally removed the permission to publish to the target in order to simulate a failure.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;In a moment, you should get an email with the alarm status.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Further, if you check the messages in the DLQ, you can see the failed message, and in the message attributes you may see the reason for the failure (depending on the error).&lt;/p&gt;&lt;/li&gt;

&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fniuqskek5wdmkfbkub83.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fniuqskek5wdmkfbkub83.png" alt="Message attributes of a failed message in DLQ" width="800" height="392"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Image: Message attributes of a failed message in DLQ.&lt;/em&gt;&lt;/p&gt;
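&lt;p&gt;The failure reason can be read programmatically from the message attributes EventBridge adds to DLQ messages (such as ERROR_CODE, ERROR_MESSAGE, RULE_ARN and TARGET_ARN). The sketch below assumes a message shape as returned by SQS &lt;code&gt;ReceiveMessage&lt;/code&gt;; the sample values are illustrative only.&lt;/p&gt;

```python
# Sketch: pull the failure details out of an SQS message that
# EventBridge placed in a DLQ. EventBridge attaches attributes such as
# ERROR_CODE, ERROR_MESSAGE, RULE_ARN and TARGET_ARN to these messages.
def failure_reason(message: dict) -> dict:
    attrs = message.get("MessageAttributes", {})
    return {
        name: attrs[name]["StringValue"]
        for name in ("ERROR_CODE", "ERROR_MESSAGE", "RULE_ARN", "TARGET_ARN")
        if name in attrs
    }

# Illustrative message shape, as returned by
# sqs.receive_message(..., MessageAttributeNames=["All"]):
sample = {
    "MessageAttributes": {
        "ERROR_CODE": {"DataType": "String", "StringValue": "NO_PERMISSIONS"},
        "ERROR_MESSAGE": {"DataType": "String", "StringValue": "not authorized"},
    }
}
```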

&lt;p&gt;You can configure the threshold, period and evaluation periods of the alarm as needed to control the frequency of the notifications in case of a failure: &lt;a href="https://github.com/pubudusj/event-bridge-target-failure-detection-with-dlq/blob/main/template.yaml#L61-L63" rel="noopener noreferrer"&gt;https://github.com/pubudusj/event-bridge-target-failure-detection-with-dlq/blob/main/template.yaml#L61-L63&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;An EventBridge bus can have a DLQ of its own, but it serves a different purpose and does not capture target failures.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You can use this dead letter queue approach to capture any messages that cannot be delivered to a target. Based on the number of messages in the queue, you can get notified using a CloudWatch metric and SNS. However, you need to configure it for each and every EventBridge target separately, so using an IaC tool is convenient here.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I will discuss another solution to achieve the same in part 2 of this blog post.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Using dead-letter queues to process undelivered events in EventBridge &lt;a href="https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-rule-dlq.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-rule-dlq.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;





</description>
      <category>aws</category>
      <category>serverless</category>
      <category>eventbridge</category>
      <category>sqs</category>
    </item>
    <item>
      <title>EventBridge to SQS when cross region and cross account</title>
      <dc:creator>Pubudu Jayawardana</dc:creator>
      <pubDate>Wed, 17 Sep 2025 07:43:30 +0000</pubDate>
      <link>https://forem.com/aws-builders/eventbridge-to-sqs-when-cross-region-and-cross-account-1jgd</link>
      <guid>https://forem.com/aws-builders/eventbridge-to-sqs-when-cross-region-and-cross-account-1jgd</guid>
      <description>&lt;h2&gt;
  
  
  Intro
&lt;/h2&gt;

&lt;p&gt;Delivering a message from EventBridge to a target SQS queue is a very common requirement in an event-driven application. Sometimes this SQS queue is in a different region, and maybe even in a different AWS account. Depending on where the target SQS queue lives, the way you set up the solution differs.&lt;/p&gt;

&lt;p&gt;In this blog post, we look at the different scenarios for how EventBridge can deliver messages to a target SQS queue that is in a different region or a different AWS account.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scenario 1: Event bus, rule and target SQS queue in same AWS account, same region
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo0enfekt6oulsn38kiwg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo0enfekt6oulsn38kiwg.png" alt="Image: Event bus, rule and target SQS queue in same AWS account, same region" width="800" height="353"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is the most common and simplest scenario. The event bus has a rule with an SQS queue configured directly as a target. All the resources are in the same region and the same AWS account.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scenario 2: Event bus and rule in one region, target SQS queue in another region in the same AWS account
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fik5fra82u1wwz9wfwhrx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fik5fra82u1wwz9wfwhrx.png" alt="Image: Event bus and rule in one region, target SQS queue in another region in the same AWS account" width="800" height="247"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this scenario, the event bus and the rule are in one region. The target SQS queue is in the same AWS account, but in a different region.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;As of today (September 2025), EventBridge supports SQS as a target only when it is in the same region.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Because of that, you first have to create a new event bus (or use the default event bus) in the second region and configure the SQS queue as a target of that bus.&lt;br&gt;
Then, configure that second-region event bus as a target in the EventBridge rule of the first region.&lt;/p&gt;
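&lt;p&gt;As a sketch, the first-region rule's target is then simply the ARN of the second-region bus. All ARNs, names, and the role below are placeholders for illustration.&lt;/p&gt;

```python
# Sketch: target definition pointing a first-region rule at an event
# bus in a second region. ARNs, names and the role are placeholders.
SECOND_REGION_BUS_ARN = "arn:aws:events:us-east-1:111122223333:event-bus/relay-bus"

def cross_region_bus_target(role_arn: str) -> dict:
    return {
        "Id": "second-region-bus",
        "Arn": SECOND_REGION_BUS_ARN,
        # A role the first-region rule assumes in order to put events
        # onto the remote bus.
        "RoleArn": role_arn,
    }

# In the second region, a separate rule on relay-bus then delivers the
# event to the SQS queue, which sits in the same region as that rule.
```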

&lt;h2&gt;
  
  
  Scenario 3: Event bus, rule in one AWS account, target SQS queue in another AWS account, but all are in the same region
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fknt2qurpru3yf889p13a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fknt2qurpru3yf889p13a.png" alt="Image: Event bus and rule in one AWS account, target SQS queue in another AWS account, but all are in the same region" width="800" height="244"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AWS introduced cross-account support for EventBridge targets at the beginning of 2025. Previously, when delivering messages to a cross-account resource, you had to use an event bus in the second account to establish the connection (similar to scenario 2 above).&lt;/p&gt;

&lt;p&gt;With this new feature, it is possible to have a cross-account target, which simplifies message delivery from an event bus to an SQS queue in another account.&lt;/p&gt;

&lt;p&gt;However, there is a limitation. Although cross-account delivery is possible, the EventBridge bus, the rule, and the target in the second AWS account &lt;strong&gt;must&lt;/strong&gt; all be in the same region.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scenario 4: Event bus, rule in one AWS account, target SQS queue in a different region in another AWS account
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftzsr1y7ksa2mbxfx5fzm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftzsr1y7ksa2mbxfx5fzm.png" alt="Image: Event bus, rule in one AWS account, target SQS queue in a different region in another AWS account" width="800" height="244"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As mentioned earlier, cross-account delivery is supported only when the target SQS queue is in the same region. If the SQS queue is in a different region, you have to use an intermediate event bus in the second account.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;The pattern is simple: an SQS queue can be a target of an EventBridge rule only when the rule and the queue are in the same region. The target SQS queue can be in another AWS account as long as the region is the same.&lt;/p&gt;

&lt;p&gt;If the SQS queue is in a different region than the EventBridge rule, you have to use an intermediate event bus in the second region. This intermediate event bus and the target SQS queue can be in another AWS account.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Event bus targets in Amazon EventBridge - &lt;a href="https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-targets.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-targets.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Introducing cross-account targets for Amazon EventBridge Event Buses - &lt;a href="https://aws.amazon.com/blogs/compute/introducing-cross-account-targets-for-amazon-eventbridge-event-buses/" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/compute/introducing-cross-account-targets-for-amazon-eventbridge-event-buses/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;





</description>
      <category>aws</category>
      <category>serverless</category>
      <category>sqs</category>
      <category>eventbridge</category>
    </item>
    <item>
      <title>How I built a spelling game with AWS Serverless and GenAI</title>
      <dc:creator>Pubudu Jayawardana</dc:creator>
      <pubDate>Wed, 02 Apr 2025 14:48:34 +0000</pubDate>
      <link>https://forem.com/aws-builders/how-i-built-a-spelling-game-with-aws-serverless-and-genai-5d65</link>
      <guid>https://forem.com/aws-builders/how-i-built-a-spelling-game-with-aws-serverless-and-genai-5d65</guid>
      <description>&lt;h2&gt;
  
  
  Intro
&lt;/h2&gt;

&lt;p&gt;In December 2024, as part of the &lt;a href="https://awsdevchallenge.devpost.com/" rel="noopener noreferrer"&gt;AWS Game Builder Challenge&lt;/a&gt;, I built a simple spelling game. For that, I used AWS serverless and GenAI services, with CDK as the IaC tool. This was my first experience building something with the help of GenAI and using Bedrock in an application. This blog post explains the project and my experience building this simple spelling game.&lt;/p&gt;

&lt;p&gt;This is the final result - the game you can play :)&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://spelling-game.pubudu.dev/" rel="noopener noreferrer"&gt;https://spelling-game.pubudu.dev/&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  How to play the game?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;The player selects a language to play the game. As of now, English (US) and Dutch are the only available languages.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When the player selects a language, a maximum of 5 words is generated. Each word comes with audio, a brief meaning, and its number of characters.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The player then needs to type the word into the text box provided.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;There is an indicator next to the text box showing how many characters have been entered and how many characters the word requires. The indicator stays red until the required number of characters is entered, then turns green.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A timer starts as soon as the words are generated; its duration is based on the number of words generated.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When the remaining time is less than 30 seconds, the background of the page as well as the background of the timer turns red.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The player can submit the answers before the timer runs out; otherwise they are submitted automatically.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The answers are then evaluated, and a pop-up appears based on the number of correct answers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If all the answers are correct, a "Confetti" effect appears on the page.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;By clicking the 'show results' button on the pop-up, the player can see the correct/incorrect answers and, in case of an incorrect answer, the correct word.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Implementation
&lt;/h2&gt;

&lt;p&gt;There are 2 main parts of this application.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Backend - where words are generated and APIs are served.&lt;/li&gt;
&lt;li&gt;Frontend - a Vue.js application for the player to interact with.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Source code
&lt;/h3&gt;

&lt;p&gt;There are two repositories available with the complete source code.&lt;/p&gt;

&lt;p&gt;Backend - &lt;strong&gt;&lt;a href="https://github.com/pubudusj/spelling-game-backend" rel="noopener noreferrer"&gt;https://github.com/pubudusj/spelling-game-backend&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Frontend - &lt;strong&gt;&lt;a href="https://github.com/pubudusj/spelling-game-frontend" rel="noopener noreferrer"&gt;https://github.com/pubudusj/spelling-game-frontend&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Deployment instructions:
&lt;/h3&gt;

&lt;p&gt;The backend is implemented using AWS CDK and can be deployed as a generic CDK application. The URL of the CloudFront distribution is required for the frontend to work.&lt;/p&gt;

&lt;p&gt;In the frontend, set &lt;code&gt;VITE_API_BASE_URL&lt;/code&gt; in the env file to the CloudFront API URL. Install the necessary dependencies, then run the application in dev mode using &lt;code&gt;npm run dev&lt;/code&gt;, or build the frontend app using &lt;code&gt;npm run build&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;When the backend stack is deployed, it outputs the S3 hosting bucket name. You can copy the built frontend app into this bucket, which hosts the frontend via the CloudFront distribution created in the backend stack.&lt;/p&gt;

&lt;h2&gt;
  
  
  Backend
&lt;/h2&gt;

&lt;p&gt;There are 2 main components of the backend.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Words generator component - to generate and save words in the database.&lt;/li&gt;
&lt;li&gt;API component - to serve frontend.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Backend - words generator component
&lt;/h3&gt;

&lt;p&gt;Here is a high-level overview of the words generator component and the steps within the state machine. Within the Step Functions execution, several AWS services are called, as explained below.&lt;/p&gt;

&lt;p&gt;Image: Words generator overview:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frncoqacegq8bh2jr14jd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frncoqacegq8bh2jr14jd.png" alt="Image: Words generator overview" width="800" height="407"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Image: Words generator state machine:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwku6yjwrhpjzfdmpo3eq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwku6yjwrhpjzfdmpo3eq.png" alt="Image: Words generator state machine" width="649" height="824"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;The Words Generator Step Functions state machine is responsible for generating words.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Step Functions execution takes a language code as input, e.g. en-US, nl-NL.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;As the first step, Bedrock InvokeModel is called to generate 5 words, with a description for each word, based on the language.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Here, the Anthropic Claude 3 Haiku model is used, which gives a good balance of accuracy and price in this scenario.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Here is the prompt I used to generate the words:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Generate 5 unique words that have a random number of characters more than 4 and less than 10 in Dutch language. For each word, provide a brief description of its meaning in English with more than a couple of words. Produce output only in a minified JSON array with the keys word and description. Word must always be in lowercase."&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Here, the response is a JSON string.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Within the step's ResultSelector, the response is converted to an array using the intrinsic function &lt;em&gt;StringToJson&lt;/em&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"words.$"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"States.StringToJson($.Body.content[0].text)"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Then there is a map state where each word is the input to one iteration of the map.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Within the Map state, there are branches (currently two) based on the language.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;In each branch, there is a step that synthesises the word with Polly using &lt;em&gt;StartSpeechSynthesisTask&lt;/em&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;em&gt;StartSpeechSynthesisTask&lt;/em&gt; is an asynchronous operation, so the next step checks whether the synthesis task has completed using Polly's &lt;em&gt;GetSpeechSynthesisTask&lt;/em&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;A great thing about the &lt;em&gt;StartSpeechSynthesisTask&lt;/em&gt; API is that it not only synthesises the speech but also automatically saves the mp3 file to the given S3 bucket.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;If &lt;em&gt;GetSpeechSynthesisTask&lt;/em&gt; reports that the synthesis task has not finished, the workflow waits and retries the status check.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Once synthesis is done, the execution continues to the save-word-to-DynamoDB step.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;In this step, we use DynamoDB's &lt;em&gt;PutItem&lt;/em&gt; API to save the generated data to the table. One record consists of the following fields:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pk - Primary key in the format of Word#{LanguageCode}. ex: Word#en-US
sk - MD5 hash of the word - here, the intrinsic function States.Hash($.word, 'MD5') is in use.
word - The word generated by Bedrock.
description - The description generated by Bedrock.
s3file - Mp3 file location provided by Polly synthesis task.
charcount - Character count of the word. This is retrieved from Polly synthesis task.
updated_at - Update timestamp.
&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Since the synthesis and save-to-DB tasks run in a Map state, a single execution makes a maximum of 5 new words available in DynamoDB for the given language.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Two EventBridge Schedules run every 5 minutes and invoke this state machine, one for each language code - English and Dutch.&lt;/p&gt;&lt;/li&gt;

&lt;/ol&gt;
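&lt;p&gt;To make the record shape concrete, here is a minimal Python sketch of the item the save step writes. This is illustrative only: the state machine actually builds the item with intrinsic functions, and the helper name and timestamp format below are my assumptions.&lt;/p&gt;

```python
import hashlib
from datetime import datetime, timezone

# Illustrative helper: builds an item in the shape the PutItem step writes.
def build_word_record(language_code, word, description, s3file, charcount):
    return {
        "pk": {"S": f"Word#{language_code}"},                 # e.g. Word#en-US
        "sk": {"S": hashlib.md5(word.encode()).hexdigest()},  # MD5 hash of the word
        "word": {"S": word},
        "description": {"S": description},
        "s3file": {"S": s3file},
        "charcount": {"N": str(charcount)},
        "updated_at": {"S": datetime.now(timezone.utc).isoformat()},
    }
```

&lt;p&gt;For example, the word "hello" in en-US gets the pk &lt;code&gt;Word#en-US&lt;/code&gt; and an sk equal to its MD5 hash, mirroring &lt;em&gt;States.Hash($.word, 'MD5')&lt;/em&gt;.&lt;/p&gt;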

&lt;h3&gt;
  
  
  Backend - API component
&lt;/h3&gt;

&lt;p&gt;The API component creates the resources the frontend interacts with.&lt;/p&gt;

&lt;p&gt;There are two APIs available in the backend.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;POST /questions&lt;/code&gt; - Generates, using Step Functions, the questions that appear on the frontend.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;POST /answers&lt;/code&gt; - To validate the answers submitted by the player.&lt;/p&gt;

&lt;p&gt;Image: Frontend and API components:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4e9rfue4ecchptd0dk34.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4e9rfue4ecchptd0dk34.png" alt="Frontend and API components" width="800" height="407"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Generate questions API
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;The generate questions API accepts a single argument: the language code.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The /questions API has a proxy integration with a Lambda function, which starts an execution of the Questions Generator state machine synchronously using the start_sync_execution SDK call.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;This Questions Generator state machine is of type &lt;strong&gt;Express&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Image: Questions Generator State Machine:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg22oowlabf186ijztxh3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg22oowlabf186ijztxh3.png" alt="Questions Generator State Machine" width="649" height="824"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A sample input is as follows:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"language"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"en-US"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"iterate"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Here, "&lt;em&gt;iterate&lt;/em&gt;" is a hard coded array to start a map execution within the state machine.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Within the state machine, first the map state is executed based on the "iterate" array from the input.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Inside the Map state, it first fetches up to 50 records from DynamoDB using a Scan. To make the result somewhat random, a random &lt;em&gt;ExclusiveStartKey&lt;/em&gt; is generated with the help of the &lt;em&gt;UUID()&lt;/em&gt; intrinsic function, and a &lt;em&gt;FilterExpression&lt;/em&gt; restricts the records to the given language code.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"TableName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:dynamodb:****"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Limit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ExclusiveStartKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"pk"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"S.$"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"States.Format('Word#{}', $$.Execution.Input.language)"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"sk"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"S.$"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"States.UUID()"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"FilterExpression"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pk = :pk"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ExpressionAttributeValues"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;":pk"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"S.$"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"States.Format('Word#{}', $$.Execution.Input.language)"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ReturnConsumedCapacity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TOTAL"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Next, the number of items returned from the previous step is checked.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If the count is more than 0, the next step selects a single random record from them. This is a Pass state that transforms the input using Parameters with the intrinsic functions &lt;em&gt;ArrayGetItem&lt;/em&gt;, &lt;em&gt;MathRandom&lt;/em&gt; and &lt;em&gt;ArrayLength&lt;/em&gt;.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"item.$"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"States.ArrayGetItem($.items,States.MathRandom(0, States.ArrayLength($.items)))"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Then, the selected record is sent to the Generate Pre-signed URL step. This Lambda function generates a pre-signed URL for the record's s3file path, so the frontend can play the mp3 file using this URL. The Lambda function also transforms the data. The pre-signed URL's expiry is set to the minimum because it is only needed within a game session.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;This is the last step within the Map state, and it outputs the record in the following format.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"XXXX"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Description of the word"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"charcount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"language"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"en-US"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Presigned-url"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Once all the Map iterations are completed, a final aggregation Lambda function - &lt;em&gt;GetUniqueResultsLambda&lt;/em&gt; - runs. Since each iteration is independent, the same record may be selected in more than one iteration; this Lambda function simply removes such duplicates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The result is then returned to the frontend as the response of the &lt;code&gt;/questions&lt;/code&gt; endpoint.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
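&lt;p&gt;The random selection and the final de-duplication above can be sketched in Python as follows. The function names are illustrative: the selection is actually done by the Pass state's intrinsic functions, and the real &lt;em&gt;GetUniqueResultsLambda&lt;/em&gt; implementation may differ.&lt;/p&gt;

```python
import random

# Rough Python equivalent of the Pass state's
# States.ArrayGetItem($.items, States.MathRandom(0, States.ArrayLength($.items)))
def pick_random_item(items):
    return items[random.randrange(len(items))]

# Sketch of GetUniqueResultsLambda: drop records picked by more than one
# independent map iteration, keyed by id, preserving order.
def get_unique_results(results):
    seen = set()
    unique = []
    for item in results:
        if item["id"] not in seen:
            seen.add(item["id"])
            unique.append(item)
    return unique
```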

&lt;h4&gt;
  
  
  Validate answers API
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;POST /answers&lt;/code&gt; API is responsible for validating the answers submitted from the frontend.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;This API endpoint has a proxy Lambda function which accepts the payload in the following format:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"language"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"en-US"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"answers"&lt;/span&gt;&lt;span class="p"&gt;:[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"312e8b6583d4b65b"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"word"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"effect"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"99b788c54c1a8265"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"word"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"perilous"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Lambda function makes a DynamoDB &lt;em&gt;batch_get_item&lt;/em&gt; SDK call to fetch the words by their ids and matches each against the word provided in the API request.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;It then returns the response in the following format:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"312e8b6583d4b65b"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"original_word"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"affect"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"correct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"99b788c54c1a8265"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"original_word"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"perilous"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"correct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Based on the value of "correct", the frontend calculates and displays the results.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
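&lt;p&gt;The matching step inside the answers Lambda can be sketched like this. The real function fetches the words via &lt;em&gt;batch_get_item&lt;/em&gt;; this sketch assumes the fetched words are already in a dict, and the case/whitespace normalisation is my assumption (the prompt forces stored words to lowercase).&lt;/p&gt;

```python
# Illustrative sketch of the answer-matching logic.
# stored_words: mapping of id -> original word (as fetched via batch_get_item)
# answers: list of {"id": ..., "word": ...} from the request payload
def validate_answers(stored_words, answers):
    results = []
    for answer in answers:
        original = stored_words.get(answer["id"], "")
        results.append({
            "id": answer["id"],
            "original_word": original,
            # Stored words are lowercase, so normalise the submitted word first.
            "correct": answer["word"].strip().lower() == original,
        })
    return results
```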

&lt;h2&gt;
  
  
  Frontend
&lt;/h2&gt;

&lt;p&gt;For the frontend, I used a simple single-page application built with Vue.js. I have very limited knowledge of frontend technologies, so I used &lt;a href="https://aws.amazon.com/q/developer/" rel="noopener noreferrer"&gt;Amazon Q Developer&lt;/a&gt; in VSCode to implement the frontend application.&lt;/p&gt;

&lt;p&gt;Almost 95% of the frontend application was built by Amazon Q Developer. I asked various questions, and in most cases Amazon Q was able to analyse the existing code and generate code matching my requirements. This was a step-by-step process where I asked Amazon Q to generate one specific piece of functionality at a time.&lt;/p&gt;

&lt;p&gt;Here are some work-in-progress "&lt;em&gt;versions&lt;/em&gt;" of the application, implemented and fine-tuned step by step using Amazon Q.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3yeo3dtuph3dsvgei8s5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3yeo3dtuph3dsvgei8s5.png" alt="work in progress page" width="800" height="647"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe2hy1jvns5z4r0ciyraq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe2hy1jvns5z4r0ciyraq.png" alt="work in progress page" width="800" height="780"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhe7bvb3bthg4w73symes.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhe7bvb3bthg4w73symes.png" alt="work in progress page" width="800" height="557"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1kbbfk5fpymsbetzj2od.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1kbbfk5fpymsbetzj2od.png" alt="work in progress page" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fucu4uey6h40s6dtwjukh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fucu4uey6h40s6dtwjukh.png" alt="work in progress page" width="800" height="506"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc2lue9qcn981seooad2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc2lue9qcn981seooad2.png" alt="work in progress page" width="750" height="545"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here are some examples of the questions I asked Amazon Q and how it analysed and generated code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2b1pcn2qfhjf0ee3lp2r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2b1pcn2qfhjf0ee3lp2r.png" alt="AmazonQ question" width="442" height="106"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fipygm95tjulc14c920v5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fipygm95tjulc14c920v5.png" alt="AmazonQ response" width="426" height="673"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjgznwgs1l5ggtl3n4bqc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjgznwgs1l5ggtl3n4bqc.png" alt="AmazonQ question" width="365" height="120"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fti41d0776j27o0lbzd6k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fti41d0776j27o0lbzd6k.png" alt="AmazonQ response" width="318" height="734"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqw8bgm2vs8y3bxiomfmx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqw8bgm2vs8y3bxiomfmx.png" alt="AmazonQ question" width="352" height="174"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixu84kiom1ood2n60pg2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixu84kiom1ood2n60pg2.png" alt="AmazonQ response" width="313" height="792"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcxcbbvw3djo8nnztvrr0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcxcbbvw3djo8nnztvrr0.png" alt="AmazonQ question" width="425" height="157"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fioo8vbuvjjpl81bld2bn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fioo8vbuvjjpl81bld2bn.png" alt="AmazonQ response" width="324" height="747"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons learnt
&lt;/h2&gt;

&lt;p&gt;Below are some lessons I learnt while working on this project, along with some feedback on the services I used.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Bedrock does not always return JSON. In the prompt, I stated: "Produce output only in a minified JSON array with the keys word and description". However, once in a while, Bedrock returns data in a different format. This could have been made more reliable by including the beginning of the expected response in the prompt, so Bedrock continues from there. However, that would increase the request token count of each API call. To avoid the additional cost, and because the error rate is acceptable (this is a background job anyway), I kept the prompt as it is.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Amazon Polly's &lt;em&gt;StartSpeechSynthesisTask&lt;/em&gt; doesn't accept a full S3 path. We can provide the &lt;em&gt;OutputS3BucketName&lt;/em&gt; where the generated audio will be stored, but we cannot specify an object key. Instead, I used the &lt;em&gt;OutputS3KeyPrefix&lt;/em&gt; parameter to provide a path containing the language code, so the audio is saved as &lt;code&gt;s3://bucket_name/language_code/file_name.mp3&lt;/code&gt;&lt;br&gt;
One minor issue is that Polly always adds a dot (.) between the prefix and the file name, so all the files generated in the sub path start with a dot.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Cost of Polly: apart from Standard, Polly also offers Generative and Neural text-to-speech engines. However, they are considerably more expensive than Standard and are available only for a limited number of languages. &lt;a href="https://aws.amazon.com/polly/pricing/" rel="noopener noreferrer"&gt;https://aws.amazon.com/polly/pricing/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Selecting random items from a DynamoDB table is hard; there is no straightforward way to do it. That's why I used a random &lt;em&gt;ExclusiveStartKey&lt;/em&gt;, fetched a maximum of 50 items, and selected one at random. This can introduce duplicates, which is why I needed the &lt;em&gt;GetUniqueResultsLambda&lt;/em&gt; to remove any duplicates across the map iterations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I initially used a direct API Gateway integration to start the Step Functions express workflow that generates questions. However, the VTL mapping templates are complex to build, especially to shape the response into a specific format, so I stuck with the simpler Lambda proxy option.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Initially, I hosted the frontend using &lt;a href="https://docs.aws.amazon.com/amplify/latest/userguide/welcome.html" rel="noopener noreferrer"&gt;Amplify Hosting&lt;/a&gt;. However, I wanted to restrict API Gateway access to the Amplify project only, and there is no option for that yet. So I switched to a CloudFront and S3 setup. This is described in detail in one of my previous blog posts: &lt;a href="https://dev.to/posts/access-api-gw-rest-api-only-from-cloudfront/"&gt;Enforce CloudFront-Only Access for AWS API Gateway&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;For the frontend, I did indeed do &lt;a href="https://en.wikipedia.org/wiki/Vibe_coding" rel="noopener noreferrer"&gt;vibe coding&lt;/a&gt;. So, there is a high chance that a frontend specialist will find unacceptable code in it 😉.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Most of the time, Amazon Q generated code with the expected functionality. However, there were cases where it couldn't fix a code snippet. For example, when I asked it to center a component, it still hadn't fixed it after 20 iterations. Maybe this is because of the complexity of the single-page structure.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiuhu0fuyyublz18g29bj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiuhu0fuyyublz18g29bj.png" alt="Amazon Q attempts to center a div" width="800" height="456"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Possible improvements
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Currently, there is no 'history' of the games a particular player has played. Adding a logged-in mode to record the player's status and progress would be a nice feature.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Add more languages: currently, only English (US) and Dutch are available to select. Having more languages would be nice; however, they need to be supported by Polly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Showing a finished game's standing against all other games would be a nice addition to the results. This would require recording the results and comparing each game against all of them.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;What are the other possible improvements you see?&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Overall, it was a great experience using Bedrock and Amazon Q. I didn't win the hackathon, but since this was my first time using these GenAI services, I learnt a lot from this project and enjoyed it thoroughly. The frontend code was mostly generated by Amazon Q, and I am happy with the results, although I am certain the frontend code could be improved in many ways.&lt;/p&gt;

&lt;h2&gt;
  
  
  Useful Links
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Getting started with Amazon Q Developer: &lt;a href="https://aws.amazon.com/q/developer/getting-started/" rel="noopener noreferrer"&gt;https://aws.amazon.com/q/developer/getting-started/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Getting started with Amazon Bedrock: &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/getting-started.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/bedrock/latest/userguide/getting-started.html&lt;/a&gt; &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Invoke and customize Amazon Bedrock models with Step Functions: &lt;a href="https://docs.aws.amazon.com/step-functions/latest/dg/connect-bedrock.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/step-functions/latest/dg/connect-bedrock.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Claude Prompt engineering overview: &lt;a href="https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview" rel="noopener noreferrer"&gt;https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;👋 I regularly create content on AWS and Serverless, and if you're interested, feel free to follow/connect with me so you don't miss out on my latest posts!&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/pubudusj" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/pubudusj&lt;/a&gt;&lt;br&gt;
Twitter/X: &lt;a href="https://x.com/pubudusj" rel="noopener noreferrer"&gt;https://x.com/pubudusj&lt;/a&gt;&lt;br&gt;
Dev.to: &lt;a href="https://dev.to/pubudusj"&gt;https://dev.to/pubudusj&lt;/a&gt;&lt;br&gt;
Personal blog: &lt;a href="https://pubudu.dev" rel="noopener noreferrer"&gt;https://pubudu.dev&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>amazonq</category>
      <category>bedrock</category>
    </item>
    <item>
      <title>Enforce CloudFront-Only Access for AWS API Gateway</title>
      <dc:creator>Pubudu Jayawardana</dc:creator>
      <pubDate>Thu, 06 Mar 2025 15:58:02 +0000</pubDate>
      <link>https://forem.com/aws-builders/enforce-cloudfront-only-access-for-aws-api-gateway-1hdd</link>
      <guid>https://forem.com/aws-builders/enforce-cloudfront-only-access-for-aws-api-gateway-1hdd</guid>
<description>&lt;p&gt;Recently, I had a requirement to expose a REST API from API Gateway exclusively through CloudFront. Keeping API Gateway behind CloudFront provides an additional layer of security, because CloudFront comes with the automatic protections of AWS Shield Standard at no additional charge.&lt;/p&gt;

&lt;p&gt;There are several ways to achieve this (for example, signing requests with Lambda@Edge in CloudFront), but I went for a simpler solution: adding a custom header to the requests CloudFront sends to API Gateway and validating it with a Lambda authorizer at the API Gateway end. This blog post explains how to implement this solution.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj2y33hae94fg36fu57lm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj2y33hae94fg36fu57lm.png" alt="Architecture" width="800" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;In the CloudFront distribution, we create an origin for the API Gateway endpoint.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We define a custom header on this CloudFront origin.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When a request comes to the CloudFront distribution, it calls the API Gateway endpoint as per the behaviour we define.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;This request contains the custom header.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;At the API Gateway end, a Lambda authorizer validates this incoming header.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;An SSM secure string parameter holds the expected value of the header.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;To validate the incoming header, the Lambda authorizer fetches this value from Parameter Store.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
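&lt;p&gt;As a sketch of the validation steps above, a minimal Lambda authorizer could look like the following. This is illustrative only: the parameter name, header name, and environment variables are assumptions, not taken from the repository.&lt;/p&gt;

```python
import os

# Illustrative names; in practice these would come from Lambda environment variables.
PARAM_NAME = os.environ.get("HEADER_PARAM_NAME", "/secure-api/header-value")
HEADER_NAME = os.environ.get("CUSTOM_HEADER_NAME", "x-origin-verify")


def build_policy(effect: str, method_arn: str) -> dict:
    """Build the IAM policy document an API Gateway Lambda authorizer returns."""
    return {
        "principalId": "cloudfront",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": method_arn,
                }
            ],
        },
    }


def handler(event, context):
    import boto3  # imported lazily so the module loads without AWS dependencies

    # Fetch the expected header value from the SSM secure string parameter.
    ssm = boto3.client("ssm")
    expected = ssm.get_parameter(Name=PARAM_NAME, WithDecryption=True)[
        "Parameter"
    ]["Value"]

    received = event.get("headers", {}).get(HEADER_NAME)
    effect = "Allow" if received == expected else "Deny"
    return build_policy(effect, event["methodArn"])
```

&lt;p&gt;In a production version, you would also cache the parameter value between invocations instead of calling SSM on every request.&lt;/p&gt;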

&lt;p&gt;Communication between the CloudFront distribution and API Gateway is secure, and there is no way for someone without access to the CloudFront settings to learn the header value. However, relying on a static value for validation is not ideal, so it is better to rotate the header value for added security.&lt;/p&gt;

&lt;h3&gt;
  
  
  Rotating the header value
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;An EventBridge scheduler invokes a Lambda function at a given interval.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;This Lambda function performs the following actions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generate a random value for the header.&lt;/li&gt;
&lt;li&gt;Update the SSM secure parameter.&lt;/li&gt;
&lt;li&gt;Update the CloudFront origin's custom header value.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Once this Lambda execution succeeds, both sides (CloudFront and API Gateway) use the new header value.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
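&lt;p&gt;The rotation steps above can be sketched as follows. This is a hedged sketch, not the repository's code: the function and parameter names are illustrative and error handling is omitted. The CloudFront part reflects the usual API shape, where you fetch the full distribution config together with its ETag and send it back.&lt;/p&gt;

```python
import secrets


def generate_header_value(num_bytes: int = 32) -> str:
    """Generate a URL-safe random token to use as the new header value."""
    return secrets.token_urlsafe(num_bytes)


def rotate(param_name: str, distribution_id: str, header_name: str) -> str:
    import boto3  # imported lazily so the module loads without AWS dependencies

    ssm = boto3.client("ssm")
    cloudfront = boto3.client("cloudfront")

    new_value = generate_header_value()

    # 1. Update the SSM secure string parameter with the new value.
    ssm.put_parameter(
        Name=param_name, Value=new_value, Type="SecureString", Overwrite=True
    )

    # 2. Update the CloudFront origin's custom header. CloudFront requires
    #    sending the full distribution config back together with its ETag.
    resp = cloudfront.get_distribution_config(Id=distribution_id)
    config, etag = resp["DistributionConfig"], resp["ETag"]
    for origin in config["Origins"]["Items"]:
        for header in origin.get("CustomHeaders", {}).get("Items", []):
            if header["HeaderName"] == header_name:
                header["HeaderValue"] = new_value
    cloudfront.update_distribution(
        Id=distribution_id, DistributionConfig=config, IfMatch=etag
    )
    return new_value
```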

&lt;h2&gt;
  
  
  Try this yourself
&lt;/h2&gt;

&lt;p&gt;Here is the GitHub repository of the project I created to try out this solution: &lt;a href="https://github.com/pubudusj/secure-api-with-cloudfront" rel="noopener noreferrer"&gt;https://github.com/pubudusj/secure-api-with-cloudfront&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can deploy this to your AWS account using CDK.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Please note:&lt;/strong&gt; First, you need to create a secure SSM parameter with any value. Then create an &lt;code&gt;.env&lt;/code&gt; file by copying the &lt;code&gt;.env.example&lt;/code&gt; file in the project root directory, and set the name/path of the parameter in the &lt;code&gt;.env&lt;/code&gt; file.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In the CDK code, you will notice that the initial header value is hardcoded.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   origin=origins.RestApiOrigin(
       rest_api,
       origin_path="/prod",
       custom_headers={custom_header_key: "test"},
   ),
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also, this obviously does not match the value you have in the SSM parameter.&lt;/p&gt;

&lt;p&gt;This is fine, because this stack includes a &lt;a href="https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/template-custom-resources.html" rel="noopener noreferrer"&gt;CloudFormation custom resource&lt;/a&gt;, which is executed &lt;strong&gt;on create&lt;/strong&gt;. This custom resource starts the initial token rotation as soon as the stack is created, generating a new header value and syncing both the SSM secure parameter and the CloudFront distribution.&lt;/p&gt;

&lt;p&gt;Once the stack is deployed, you can access the API using both the API Gateway endpoint and the CloudFront endpoint.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CloudFront endpoint: &lt;code&gt;https://[CloudfrontPrefix].cloudfront.net/prod/hello&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;API Gateway endpoint: &lt;code&gt;https://[APIGWPrefix].execute-api.[region].amazonaws.com/prod/hello&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You will notice that the CloudFront endpoint works fine, while the API Gateway endpoint returns a &lt;code&gt;401 Unauthorized&lt;/code&gt; error.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tips/Lesson Learned
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Here I have used SSM Parameter Store with a custom-built rotation mechanism. However, you can use Secrets Manager instead, with its built-in rotation features. Make sure you are aware of the differences; Yan has compared the two in detail in his blog post: &lt;a href="https://theburningmonk.com/2023/03/the-old-faithful-why-ssm-parameter-store-still-reigns-over-secrets-manager/" rel="noopener noreferrer"&gt;https://theburningmonk.com/2023/03/the-old-faithful-why-ssm-parameter-store-still-reigns-over-secrets-manager/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Keep in mind that setting a new custom header value on the CloudFront origin triggers a re-deployment. CloudFront re-deployments take time, anywhere from a couple of minutes to 10-15 minutes, though a small change like this should be on the faster end.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Because of this, during a CloudFront re-deployment, your SSM parameter and the incoming header value at API Gateway can be different. One solution I used to address this is caching the Lambda authorizer result. Here I used a TTL of 5 minutes, but you can adjust it as required.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;authorizer = apigateway.RequestAuthorizer(
   self,
   "LambdaHeaderAuthorizer",
   handler=custom_authorizer,
   identity_sources=[apigateway.IdentitySource.header(custom_header_key)],
   results_cache_ttl=Duration.minutes(5),
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Useful Links
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Cloudfront custom headers: &lt;a href="https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/add-origin-custom-headers.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/add-origin-custom-headers.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;API Gateway Lambda authorizer: &lt;a href="https://docs.aws.amazon.com/apigateway/latest/developerguide/apigateway-use-lambda-authorizer.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/apigateway/latest/developerguide/apigateway-use-lambda-authorizer.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;Please let me know your thoughts on this implementation.&lt;/p&gt;

&lt;p&gt;You can find more AWS and Serverless contents at my personal blog: &lt;a href="https://pubudu.dev" rel="noopener noreferrer"&gt;https://pubudu.dev&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And don't forget to follow me on LinkedIn too:&lt;br&gt;
&lt;a href="https://www.linkedin.com/in/pubudusj" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/pubudusj&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>cloudfront</category>
      <category>apigateway</category>
    </item>
    <item>
      <title>SQS encryption options</title>
      <dc:creator>Pubudu Jayawardana</dc:creator>
      <pubDate>Wed, 04 Sep 2024 09:33:17 +0000</pubDate>
      <link>https://forem.com/aws-builders/sqs-encryption-options-21g9</link>
      <guid>https://forem.com/aws-builders/sqs-encryption-options-21g9</guid>
<description>&lt;p&gt;When building distributed applications on AWS, Amazon Simple Queue Service (SQS) often becomes a crucial component in managing message flow between services. Ensuring the security of these messages is important, especially when dealing with sensitive data across multiple AWS accounts. In this post, we’ll explore the different encryption options available for SQS and how to choose the best option for scenarios like cross-account access.&lt;/p&gt;

&lt;h2&gt;
  
  
  In-Transit (as it travels to and from Amazon SQS) Encryption
&lt;/h2&gt;

&lt;p&gt;You can protect data in transit using HTTPS (TLS). This ensures that messages are protected as they travel between your application and SQS, preventing attacks such as man-in-the-middle. You can enforce encrypted connections over HTTPS (TLS) only, using the &lt;code&gt;aws:SecureTransport&lt;/code&gt; condition in the queue policy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"Condition"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Bool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"aws:SecureTransport"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"true"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
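&lt;p&gt;For context, here is what a complete queue policy statement enforcing this could look like: an explicit deny for any request where &lt;code&gt;aws:SecureTransport&lt;/code&gt; is false. The queue ARN below is a placeholder.&lt;/p&gt;

```json
{
  "Sid": "DenyInsecureTransport",
  "Effect": "Deny",
  "Principal": "*",
  "Action": "sqs:*",
  "Resource": "arn:aws:sqs:eu-west-1:111111111111:my_sqs_queue",
  "Condition": {
    "Bool": {
      "aws:SecureTransport": "false"
    }
  }
}
```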



&lt;p&gt;Another option is client-side encryption, where you encrypt data before sending it to SQS. Here, you have to manage the encryption and decryption mechanism yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Server-Side Encryption
&lt;/h2&gt;

&lt;p&gt;Server-side encryption (SSE) adds an extra layer of security by encrypting the contents of your queue at the storage level. SSE protects the contents of messages in queues using SQS-managed encryption keys (SSE-SQS) or keys managed in the AWS Key Management Service (SSE-KMS).&lt;/p&gt;

&lt;h4&gt;
  
  
  SSE-SQS (SQS Managed Keys)
&lt;/h4&gt;

&lt;p&gt;This is the simplest option, where Amazon SQS takes care of the encryption keys for you. SQS generates, manages, and uses the encryption key, requiring no additional configuration on your part. &lt;a href="https://aws.amazon.com/about-aws/whats-new/2022/10/amazon-sqs-announces-server-side-encryption-ssq-managed-sse-sqs-default/" rel="noopener noreferrer"&gt;Since October 2022&lt;/a&gt;, this has been enabled by default for any new SQS queue.&lt;/p&gt;

&lt;h4&gt;
  
  
  SSE-KMS (AWS Key Management Service Keys)
&lt;/h4&gt;

&lt;p&gt;This option provides more control by allowing you to use AWS Key Management Service (KMS) to manage the encryption keys. With SSE-KMS, you can use an existing KMS key or create a new one specifically for your SQS queue. This method enables finer-grained access control and auditing capabilities compared to SSE-SQS.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cross-account access with SSE
&lt;/h2&gt;

&lt;p&gt;As SQS is one of the most used messaging services between software components, cross-account access is a very common scenario. Still, we need to make sure the messages exchanged through SQS queues are secured.&lt;/p&gt;

&lt;p&gt;In general, you can manage cross-account access for an SQS queue using its access policy.&lt;/p&gt;

&lt;p&gt;Below is the access policy of &lt;code&gt;my_sqs_queue&lt;/code&gt; in account &lt;code&gt;111111111111&lt;/code&gt;. It grants account &lt;code&gt;222222222222&lt;/code&gt; permission to send messages to &lt;code&gt;my_sqs_queue&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cross_account_access"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Principal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"AWS"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:iam::222222222222:root"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"sqs:GetQueueAttributes"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"sqs:GetQueueUrl"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"sqs:SendMessage"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:sqs:eu-west-1:111111111111:my_sqs_queue"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Depending on the server-side encryption used on the queue, additional permissions may be required to send messages to &lt;code&gt;my_sqs_queue&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  If SSE-SQS is used
&lt;/h4&gt;

&lt;p&gt;The good news is that if SSE-SQS is used, account &lt;code&gt;222222222222&lt;/code&gt; requires no additional encryption-related permissions, which means the above access policy is sufficient to send messages to &lt;code&gt;my_sqs_queue&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  If SSE-KMS is used
&lt;/h4&gt;

&lt;p&gt;If SSE-KMS is used, additional permissions for the KMS key must be granted in order to successfully send messages to &lt;code&gt;my_sqs_queue&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Let's assume &lt;code&gt;my_sqs_queue&lt;/code&gt; is encrypted using a KMS key in the same account &lt;code&gt;111111111111&lt;/code&gt;, with the alias &lt;code&gt;my_kms_key&lt;/code&gt;. In the key policy of &lt;code&gt;my_kms_key&lt;/code&gt;, you have to grant permissions to account &lt;code&gt;222222222222&lt;/code&gt; as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Principal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"AWS"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:iam::222222222222:root"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"kms:Encrypt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"kms:Decrypt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"kms:ReEncrypt*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"kms:GenerateDataKey*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"kms:DescribeKey"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:kms:eu-west-1:111111111111:key/[key-id]"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Further, in account &lt;code&gt;222222222222&lt;/code&gt;, the sending principal needs the &lt;code&gt;kms:GenerateDataKey&lt;/code&gt; permission for the KMS key as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"kms:GenerateDataKey"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:kms:eu-west-1:111111111111:key/[key-id]"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion: What SSE method to choose?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Both SSE-SQS and SSE-KMS support cross-account access for SQS queues.&lt;/strong&gt; The key difference lies in how much control and responsibility you want over the encryption process.&lt;/p&gt;

&lt;p&gt;SSE-SQS is ideal when you need simple, effective encryption without the additional complexity of managing KMS keys. It suits most general use cases where ease of setup and management is a priority.&lt;/p&gt;

&lt;p&gt;Use SSE-KMS when you require more control over your encryption keys and need to meet strict security and compliance requirements. This option is suited for environments where key management and detailed access control are critical.&lt;/p&gt;

&lt;h2&gt;
  
  
  Useful Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Amazon SQS security best practices: &lt;a href="https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-security-best-practices.html#implement-server-side-encryption" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-security-best-practices.html#implement-server-side-encryption&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Encryption at rest in Amazon SQS - Developer Guide : &lt;a href="https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-server-side-encryption.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-server-side-encryption.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Amazon SQS Key management - Developer guide: &lt;a href="https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-key-management.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-key-management.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>sqs</category>
      <category>encryption</category>
    </item>
    <item>
      <title>Dead Letter Queue (DLQ) for AWS Step Functions</title>
      <dc:creator>Pubudu Jayawardana</dc:creator>
      <pubDate>Mon, 27 May 2024 07:28:38 +0000</pubDate>
      <link>https://forem.com/aws-builders/dead-letter-queue-dlq-for-aws-step-functions-25eo</link>
      <guid>https://forem.com/aws-builders/dead-letter-queue-dlq-for-aws-step-functions-25eo</guid>
      <description>&lt;h2&gt;
  
  
  Intro
&lt;/h2&gt;

&lt;p&gt;When it comes to building a software system, one of the most critical components is error handling, because “everything fails all the time”. While it's impossible to anticipate every possible failure scenario, having a plan in place for when something unexpected happens always helps the robustness and resilience of the system.&lt;/p&gt;

&lt;p&gt;This ‘plan’ can be a simple &lt;strong&gt;Dead Letter Queue (DLQ)&lt;/strong&gt;!&lt;/p&gt;

&lt;p&gt;Dead letter queues act as a safety net where you can keep messages when unexpected issues arise. This helps you isolate problematic messages, debug the issue without disrupting the rest of your workflow, and then retry or reprocess the messages as needed.&lt;/p&gt;

&lt;p&gt;Many AWS Serverless services support SQS queues as the dead letter queue natively. However, Step Functions - one of the main Serverless services offered by AWS for workflow orchestration - &lt;strong&gt;does not support&lt;/strong&gt; dead letter queues &lt;strong&gt;natively&lt;/strong&gt; (yet).&lt;/p&gt;

&lt;p&gt;In this blog post, I am going to discuss a couple of workarounds to safely capture, in a dead letter queue, any messages that a Step Functions execution fails to process.&lt;/p&gt;
&lt;h2&gt;
  
  
  Scenario
&lt;/h2&gt;

&lt;p&gt;Let’s consider a very common use case where we have messages in an SQS source queue that need to be processed by Step Functions. First, the messages are read by a Lambda function that starts a Step Functions execution for each message.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsr7zz4kxpvfxf2dx7h0h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsr7zz4kxpvfxf2dx7h0h.png" alt="Scenario" width="620" height="139"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here, once the Lambda function reads a message from the source queue and successfully starts the Step Functions execution, the message is marked as successfully processed and deleted from the queue. If any error occurs during the Step Functions execution, the message is lost.&lt;/p&gt;
&lt;h2&gt;
  
  
  Solution 01
&lt;/h2&gt;

&lt;p&gt;In order to retain the message even when there is a failure in the Step Functions execution, we can add a &lt;strong&gt;second SQS queue that acts as a dead letter queue&lt;/strong&gt; as follows.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjo9svjtes8fmc35flt4c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjo9svjtes8fmc35flt4c.png" alt="Solution 1" width="679" height="391"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The state machine will look like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbv7zlfpppkosf9vrgbtt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbv7zlfpppkosf9vrgbtt.png" alt="Solution 1 - state machine" width="419" height="416"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  How it works
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Within the state machine, we create a step named “Send message to DLQ”, which is an SDK integration for the SQS SendMessage action.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In this step, the message to be sent to the DLQ is built from the execution input retrieved from the context object.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In the other steps of the state machine, as required, we configure Catch rules in the error handling settings, using the above “Send message to DLQ” step as the catcher.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;This way, when an error happens in a state, the message is sent to the DLQ, and we can re-process it from there.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
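&lt;p&gt;As an illustration of the steps above (not the exact definition from the repository), the “Send message to DLQ” state and a catcher pointing at it could look like this in Amazon States Language. The queue URL, state names, and the catching state's details are placeholders.&lt;/p&gt;

```json
{
  "Process message": {
    "Type": "Task",
    "Resource": "arn:aws:states:::lambda:invoke",
    "Catch": [
      {
        "ErrorEquals": ["States.ALL"],
        "ResultPath": "$.error",
        "Next": "Send message to DLQ"
      }
    ],
    "End": true
  },
  "Send message to DLQ": {
    "Type": "Task",
    "Resource": "arn:aws:states:::sqs:sendMessage",
    "Parameters": {
      "QueueUrl": "https://sqs.eu-west-1.amazonaws.com/111111111111/my-dlq",
      "MessageBody.$": "$$.Execution.Input"
    },
    "End": true
  }
}
```

&lt;p&gt;Here &lt;code&gt;$$.Execution.Input&lt;/code&gt; pulls the original execution input from the context object, so the DLQ receives the message exactly as the execution started with it.&lt;/p&gt;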
&lt;h3&gt;
  
  
  Try this yourself
&lt;/h3&gt;

&lt;p&gt;Here is the GitHub repository of a sample project I created to try out this solution:&lt;br&gt;
&lt;a href="https://github.com/pubudusj/dlq-for-stepfunctions" rel="noopener noreferrer"&gt;https://github.com/pubudusj/dlq-for-stepfunctions&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can deploy this to your AWS account using CDK.&lt;/p&gt;

&lt;p&gt;Once deployed, you can test the functionality by sending a message to the source SQS queue in the expected format below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"metadata"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"foo"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"bar"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see the message ends up in the DLQ, and the "metadata" object now includes the "attempt_number" as well.&lt;br&gt;
As an alternative to using the metadata section of the message to track the attempt number, you may use an SQS message attribute.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Please note:&lt;/strong&gt; In this approach, the DLQ is not a "real" DLQ, as it is not configured on the source SQS queue. However, it helps capture any messages that the Step Functions execution fails to process.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  Solution 02
&lt;/h2&gt;

&lt;p&gt;In this method, we will use a real DLQ that is configured with the source queue. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wk09kyszgnxwevsjvqn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3wk09kyszgnxwevsjvqn.png" alt="Solution 2" width="615" height="309"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The state machine will look like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5t0nadkn9t1onhveir8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5t0nadkn9t1onhveir8.png" alt="Solution 2 - state machine" width="593" height="332"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  How it works
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;There is a source SQS queue and a DLQ configured to it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In the DLQ settings, the max receive count is set as &amp;gt; 1 so the message will be available in the DLQ immediately after the first failure.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;There is a Lambda function with a trigger set up to process messages from the source queue and start the Step Functions execution.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In the Event Source Mapping settings of this Lambda function, the ReportBatchItemFailures response type must be enabled.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;First, when the message is processed by the Lambda function, we set the visibility timeout of the message to a larger value. This must be larger than the time it takes to complete the Step Functions execution.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Then, the Step Functions execution is started. Here, we pass the SQS queue URL and the message receipt handle along with the original values from the SQS message.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In the example above, we use a simple Choice state to determine whether to simulate a failure.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If the execution is successful, we call the step "Delete SQS Message". Here we use the SQS SDK integration to delete the message, using the SQS queue URL and receipt handle values received in the input payload.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If it is a failure, we call a step named "Set message visibility timeout 5s". Here we use the SQS SDK integration for the "changeMessageVisibility" action to set the SQS message's visibility to 5 seconds, again using the SQS queue URL and receipt handle values passed in the execution input.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Once the message visibility is set to 5 seconds, the message reappears on the source queue after 5 seconds. However, since the max receive count has already been reached, the message is immediately moved to the DLQ of the source queue.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
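&lt;p&gt;Steps 3 to 5 above could look roughly like the following Lambda handler. This is a hedged sketch, not the repository's exact code; the environment variable names and the execution input shape are my assumptions.&lt;/p&gt;

```python
import json
import os

# Sketch of the trigger Lambda: extend the message's visibility, then start
# the Step Functions execution with the queue URL and receipt handle included.
VISIBILITY_TIMEOUT_SECONDS = 900  # must exceed the Step Functions execution time


def build_execution_input(record, queue_url):
    """Combine the original message body with the queue URL and receipt
    handle the state machine needs for its delete / change-visibility steps."""
    return {
        "original": json.loads(record["body"]),
        "sqsQueueUrl": queue_url,
        "sqsReceiptHandle": record["receiptHandle"],
    }


def handler(event, context):
    import boto3  # imported lazily so the sketch stays importable without the AWS SDK

    sqs = boto3.client("sqs")
    sfn = boto3.client("stepfunctions")
    queue_url = os.environ["QUEUE_URL"]
    failures = []
    for record in event["Records"]:
        try:
            # 1. Extend visibility so the message does not reappear while
            #    the Step Functions execution is still running.
            sqs.change_message_visibility(
                QueueUrl=queue_url,
                ReceiptHandle=record["receiptHandle"],
                VisibilityTimeout=VISIBILITY_TIMEOUT_SECONDS,
            )
            # 2. Start the execution with the receipt handle in its input.
            sfn.start_execution(
                stateMachineArn=os.environ["STATE_MACHINE_ARN"],
                input=json.dumps(build_execution_input(record, queue_url)),
            )
        except Exception:
            # With ReportBatchItemFailures enabled, only this message is retried.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```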
&lt;h3&gt;
  
  
  Try this yourself
&lt;/h3&gt;

&lt;p&gt;I have another GitHub repository for you to try this in your own AWS environment. You can set it up using CDK/Python.&lt;br&gt;
&lt;a href="https://github.com/pubudusj/dlq-for-stepfunctions-2" rel="noopener noreferrer"&gt;https://github.com/pubudusj/dlq-for-stepfunctions-2&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To simulate a failure scenario, send a message to the source queue with the "failed" field set to true.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"foo"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"bar"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"failed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will make the step function execution fail and the message will be immediately available in the DLQ of the source queue.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;With this approach, you can use the native DLQ functionality when messages cannot be processed by the Step Functions execution.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Step Functions is one of the most widely used AWS serverless services. However, it does not yet support dead letter queues (DLQs) natively. Still, there are workarounds that achieve this with a few simple steps. This blog post explained two such workarounds, which help you build a more resilient system.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>stepfunctions</category>
      <category>sqs</category>
    </item>
    <item>
      <title>Call external APIs with OAuth within Step Functions</title>
      <dc:creator>Pubudu Jayawardana</dc:creator>
      <pubDate>Sat, 09 Dec 2023 08:24:44 +0000</pubDate>
      <link>https://forem.com/aws-builders/call-external-apis-with-oauth-within-step-functions-5cme</link>
      <guid>https://forem.com/aws-builders/call-external-apis-with-oauth-within-step-functions-5cme</guid>
      <description>&lt;p&gt;Until last week, if you needed to call an external API from Step Functions execution, you had to use a Lambda function. And you needed to manage the responses and make the execution fail or success based on that. Also, most probably due to the fact that the external API is protected, custom code is required to manage the authorisation of the api call.&lt;/p&gt;

&lt;p&gt;As you see this involves significant custom code which needs to be maintained by the developers.&lt;/p&gt;

&lt;p&gt;Good news is, in the re:Invent 2023, AWS has introduced native integration to HTTPS APIs from Step Functions which allows to call to any 3rd party API endpoint as a step in the execution and, based on the response, you can perform any business logic remaining in the state machine.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;Under the hood, Step Functions utilises an &lt;a href="https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-api-destinations.html#eb-api-destination-connection" rel="noopener noreferrer"&gt;EventBridge connection&lt;/a&gt; to manage the authentication credentials for the connection to the third-party API.&lt;/p&gt;

&lt;h3&gt;
  
  
  Create EventBridge Connection
&lt;/h3&gt;

&lt;p&gt;When creating an EventBridge connection, you can select from 3 different authentication options.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Basic (Username/Password)&lt;/li&gt;
&lt;li&gt;OAuth Client Credentials&lt;/li&gt;
&lt;li&gt;API Key&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For this example, we will use OAuth client credentials as the authorisation mechanism.&lt;/p&gt;

&lt;p&gt;To create an EventBridge connection with OAuth client credentials, you have to provide the following information.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Client id&lt;/li&gt;
&lt;li&gt;Client secret&lt;/li&gt;
&lt;li&gt;Auth endpoint &lt;/li&gt;
&lt;li&gt;HTTP method&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once the EventBridge connection creation request is received, the provided OAuth credentials are first stored in AWS Secrets Manager.&lt;/p&gt;

&lt;p&gt;Next, it calls the auth endpoint with the given client id and client secret to obtain an auth token.&lt;/p&gt;

&lt;p&gt;If the auth call is successful, the auth token is securely saved in the same secret previously created in Secrets Manager, and this token is used in the calls to the API endpoint.&lt;/p&gt;

&lt;p&gt;If the auth attempt fails, the EventBridge connection is simply de-registered and can no longer be used.&lt;br&gt;
Once the EventBridge connection is successfully created, we can use it in a state machine as the authentication option for the API endpoint.&lt;/p&gt;
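&lt;p&gt;As a rough illustration, such a connection can also be created with the AWS SDK. The sketch below only builds the request parameters for boto3's create_connection call; the connection name, client credentials, and auth endpoint are placeholders.&lt;/p&gt;

```python
# Build the parameters for events.create_connection with OAuth client
# credentials. All concrete values here are illustrative placeholders.
def oauth_connection_params(name, client_id, client_secret, auth_endpoint):
    return {
        "Name": name,
        "AuthorizationType": "OAUTH_CLIENT_CREDENTIALS",
        "AuthParameters": {
            "OAuthParameters": {
                "ClientParameters": {
                    "ClientID": client_id,
                    "ClientSecret": client_secret,
                },
                "AuthorizationEndpoint": auth_endpoint,
                "HttpMethod": "POST",
            }
        },
    }


params = oauth_connection_params(
    "my-api-connection",
    "my-client-id",
    "my-client-secret",
    "https://auth.example.com/oauth/token",
)
# Usage: boto3.client("events").create_connection(**params)
```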
&lt;h3&gt;
  
  
  Create State Machine with 3rd party API
&lt;/h3&gt;

&lt;p&gt;In the state machine, simply add a "Call third-party API" step and add the configuration values: the API endpoint and the HTTP method.&lt;/p&gt;

&lt;p&gt;For authentication, enter the ARN of the EventBridge Connection created previously.&lt;/p&gt;

&lt;p&gt;Also, you can configure the request payload that is sent to the API, and you may use reference paths to build the payload from runtime data.&lt;/p&gt;
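&lt;p&gt;For illustration, the resulting "Call third-party API" task state might look like the following, shown here as a Python dict of the ASL definition; the endpoint, connection ARN, and payload fields are placeholders rather than values from the sample application.&lt;/p&gt;

```python
# Hedged sketch of an ASL "Call third-party API" task state; all values
# below (endpoint, connection ARN, payload) are illustrative placeholders.
call_api_state = {
    "Type": "Task",
    "Resource": "arn:aws:states:::http:invoke",
    "Parameters": {
        "ApiEndpoint": "https://api.example.com/orders",
        "Method": "POST",
        "Authentication": {
            "ConnectionArn": "arn:aws:events:eu-west-1:111122223333:connection/my-api-connection/aaaabbbb"
        },
        # Reference paths pull values from the execution input at runtime.
        "RequestBody": {"orderId.$": "$.orderId"},
    },
    "End": True,
}
```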
&lt;h3&gt;
  
  
  Refreshing OAuth Token
&lt;/h3&gt;

&lt;p&gt;Normally, OAuth uses short-lived tokens. When an API request uses a token that has already expired, the API returns a 401 Unauthorized error. In such a case, you first need to renew the token by calling the auth endpoint and then use the new token to call the API.&lt;/p&gt;

&lt;p&gt;In the Step Function HTTP API call, this is taken care of by the EventBridge Connection.&lt;/p&gt;

&lt;p&gt;Let's assume the API returns a 401 error, the standard OAuth response when the token has expired. In that case, the EventBridge connection automatically calls its auth endpoint, retrieves a new token, and updates the Secrets Manager entry.&lt;/p&gt;

&lt;p&gt;So, in the state machine, you need to set up a "Retry" on the "Call third-party API" step for the specific error "States.Http.StatusCode.401". This retry automatically resolves the unauthorized error without any additional steps.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You can always catch specific error using the HTTP error code as follows:&lt;br&gt;
&lt;strong&gt;States.Http.StatusCode.[Status_Code]&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In EventBridge Connection, OAuth tokens are refreshed when a &lt;strong&gt;401&lt;/strong&gt; or &lt;strong&gt;407&lt;/strong&gt; response is returned.&lt;/p&gt;

&lt;p&gt;Since the Step Functions execution uses the EventBridge connection and Secrets Manager, the necessary permissions must be set in the state machine role.&lt;/p&gt;
&lt;/blockquote&gt;
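&lt;p&gt;The retry described above could be sketched as follows, shown as a Python dict of the ASL Retry block; the interval and attempt counts are illustrative choices, not the sample application's exact values.&lt;/p&gt;

```python
# Illustrative Retry block for the "Call third-party API" state: on a 401,
# the EventBridge connection refreshes the token, so a simple retry succeeds.
retry_on_expired_token = [
    {
        "ErrorEquals": ["States.Http.StatusCode.401"],
        "IntervalSeconds": 1,
        "MaxAttempts": 2,
        "BackoffRate": 2.0,
    }
]
```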
&lt;h2&gt;
  
  
  Try this yourself
&lt;/h2&gt;

&lt;p&gt;I have created a sample application that uses a Step Functions step to call an external API, built with CDK and Python. The GitHub repository can be found at:&lt;br&gt;
&lt;a href="https://github.com/pubudusj/step-functions-https-api-integration" rel="noopener noreferrer"&gt;https://github.com/pubudusj/step-functions-https-api-integration&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Clone the repository.&lt;/li&gt;
&lt;li&gt;Install the dependencies and deploy the stack using CDK CLI.&lt;/li&gt;
&lt;li&gt;Once deployed, it will create 2 Lambda functions with Function URLs.&lt;/li&gt;
&lt;li&gt;One Function URL will be used as the API endpoint, while the other will be used as the authorization endpoint.&lt;/li&gt;
&lt;li&gt;An EventBridge connection is also created with the authorization endpoint's Function URL.&lt;/li&gt;
&lt;li&gt;Finally, a state machine is created with a single step that calls the API endpoint, authorised via the EventBridge connection, with 2 retries set up in case the API returns status code 401.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgjn24s4gaevyebsihh1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgjn24s4gaevyebsihh1.png" alt="State Machine with Call 3rd party API state" width="264" height="231"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Testing
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;As soon as the stack is deployed, you can see a secret created in Secrets Manager with the given client id, client secret and a newly generated auth token.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;You can initialize a Step Functions execution with the input below.&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"set401"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;li&gt;&lt;p&gt;You can see it successfully completes and the output includes the return value from the API Lambda function. Also, in the Lambda logs, you can see the auth token from the secret is being used in the header as the Bearer token.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;To simulate a 401 error, initialize a Step Functions execution with the input below:&lt;br&gt;
&lt;/p&gt;

&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"set401"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;




&lt;/li&gt;

&lt;li&gt;&lt;p&gt;With this input, the API Lambda function will return a 401 error for the initial attempt. The state will immediately start retrying, and the step will succeed on a retry.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;If you check the secret in Secrets Manager, you will see that the token has been updated with a new value. Also, if you check the API Lambda's logs, you can see the request headers include different Bearer token values in the initial and retry attempts.&lt;/p&gt;&lt;/li&gt;

&lt;/ol&gt;

&lt;h2&gt;
  
  
  Express vs Standard workflows
&lt;/h2&gt;

&lt;p&gt;This feature works in both Express and Standard state machine types. However, keep in mind that although Express workflows are generally more cost-effective than Standard workflows, the cost of an Express workflow also depends on the execution duration.&lt;/p&gt;

&lt;p&gt;So, if your API is slow, it will cost more. For the Standard type, the call is a single state transition, so the time it takes to call the API does not affect pricing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;The introduction of HTTP integration support in AWS Step Functions is a significant enhancement that simplifies the process of invoking external APIs. This feature not only reduces the custom Lambda code needed for API calls but also eliminates the necessity for custom code to handle complex retry logic when generating and renewing authentication tokens.&lt;/p&gt;

&lt;p&gt;With this capability, developers can seamlessly utilise third-party APIs with minimal code, and enhance the overall efficiency of workflow orchestration in AWS Step Functions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Useful Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Step Functions external endpoints AWS Blog: &lt;a href="https://aws.amazon.com/blogs/aws/external-endpoints-and-testing-of-task-states-now-available-in-aws-step-functions/" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/aws/external-endpoints-and-testing-of-task-states-now-available-in-aws-step-functions/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Step Functions external endpoints documentation: &lt;a href="https://docs.aws.amazon.com/step-functions/latest/dg/connect-third-party-apis.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/step-functions/latest/dg/connect-third-party-apis.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;EventBridge Connection documentation: &lt;a href="https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-api-destinations.html#eb-api-destination-connection" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-api-destinations.html#eb-api-destination-connection&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>stepfunctions</category>
      <category>eventdriven</category>
    </item>
    <item>
      <title>3 ways to catch all the events going through the EventBridge Event Bus</title>
      <dc:creator>Pubudu Jayawardana</dc:creator>
      <pubDate>Wed, 01 Nov 2023 20:49:11 +0000</pubDate>
      <link>https://forem.com/aws-builders/3-ways-to-catch-all-the-events-going-through-the-eventbridge-event-bus-aja</link>
      <guid>https://forem.com/aws-builders/3-ways-to-catch-all-the-events-going-through-the-eventbridge-event-bus-aja</guid>
      <description>&lt;p&gt;For some requirements, you will need to record all the events that go through your EventBridge Event Bus. CloudWatch can be a suitable target for this. &lt;a href="https://repost.aws/knowledge-center/cloudwatch-log-group-eventbridge" rel="noopener noreferrer"&gt;https://repost.aws/knowledge-center/cloudwatch-log-group-eventbridge&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this blog post, I am going to discuss 3 different rules that can be used to implement catch-all functionality for an EventBridge event bus.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using prefix
&lt;/h2&gt;

&lt;p&gt;You can use the “prefix” pattern matching feature of an EventBridge rule to capture all the events.&lt;br&gt;
Setting the prefix value to an empty string does the trick.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EventRuleCatchAllWithPrefix:
    Type: AWS::Events::Rule
    Properties:
      Description: "EventRule to catch all using prefix"
      EventBusName: !Ref MyEventBus
      EventPattern:
        source:
          - prefix: ""
      Targets:
        - Arn: !GetAtt CatchAllWithPrefixLogGroup.Arn
          Id: "TargetCatchAllWithPrefix"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Refer to the prefix matching docs here: &lt;a href="https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-event-patterns-content-based-filtering.html#eb-filtering-prefix-matching" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-event-patterns-content-based-filtering.html#eb-filtering-prefix-matching&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this example, I have used the field “source” to apply the prefix filter, since every event going through an event bus has a source field. As an alternative, you may use any field that always exists in the event.&lt;/p&gt;
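&lt;p&gt;For reference, every event delivered through an event bus carries the standard envelope below (shown as a Python dict with illustrative placeholder values), which is why fields like “source” and “version” are safe anchors for a catch-all pattern.&lt;/p&gt;

```python
# The standard EventBridge event envelope; all values are placeholders.
# "source" and "version" are always present, so a pattern on either field
# matches every event on the bus.
sample_event = {
    "version": "0",
    "id": "11112222-3333-4444-5555-666677778888",
    "detail-type": "OrderCreated",
    "source": "com.example.orders",
    "account": "123456789012",
    "time": "2023-11-01T20:49:11Z",
    "region": "eu-west-1",
    "resources": [],
    "detail": {"orderId": "123"},
}
```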

&lt;h2&gt;
  
  
  Using version
&lt;/h2&gt;

&lt;p&gt;Similar to prefix matching, we can exactly match the “version” of the event to capture all the events. Every event going through an event bus includes the version field with the value 0. As of now, there is no version value other than 0, but this might change in the future.&lt;/p&gt;

&lt;p&gt;Here is an example of how you can define the catch-all rule by exactly matching the version.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;EventRuleCatchAllWithVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Events::Rule&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;EventRule&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;catch&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;all&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;using&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;version"&lt;/span&gt;
      &lt;span class="na"&gt;EventBusName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;MyEventBus&lt;/span&gt;
      &lt;span class="na"&gt;EventPattern&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;Targets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Arn&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;CatchAllWithVersionLogGroup.Arn&lt;/span&gt;
          &lt;span class="na"&gt;Id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TargetCatchAllWithVersion"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Using wildcard
&lt;/h2&gt;

&lt;p&gt;EventBridge recently announced the support for wildcards in their event rules (&lt;a href="https://aws.amazon.com/about-aws/whats-new/2023/10/amazon-eventbridge-wildcard-filters-rules/" rel="noopener noreferrer"&gt;https://aws.amazon.com/about-aws/whats-new/2023/10/amazon-eventbridge-wildcard-filters-rules/&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;We can use this to form the catch-all rule as follows: use any field that always exists in the event (here, the “source” field) and apply the wildcard “*”.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;EventRuleCatchAllWithWildcard&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Events::Rule&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;EventRule&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;catch&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;all&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;using&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;wildcard"&lt;/span&gt;
      &lt;span class="na"&gt;EventBusName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;MyEventBus&lt;/span&gt;
      &lt;span class="na"&gt;EventPattern&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;wildcard&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;
      &lt;span class="na"&gt;Targets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Arn&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;CatchAllWithWildcardLogGroup.Arn&lt;/span&gt;
          &lt;span class="na"&gt;Id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TargetCatchAllWithWildcard"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Refer to the wildcard matching docs here: &lt;a href="https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-event-patterns-content-based-filtering.html#eb-filtering-wildcard-matching" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-event-patterns-content-based-filtering.html#eb-filtering-wildcard-matching&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Try this yourself
&lt;/h2&gt;

&lt;p&gt;Here is the GitHub repository I created to demonstrate this functionality. You can deploy it into your AWS environment using the AWS SAM CLI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/pubudusj/eventbridge-catch-all" rel="noopener noreferrer"&gt;https://github.com/pubudusj/eventbridge-catch-all&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once deployed, it will create an event bus, the 3 different rules discussed above, and 3 different CloudWatch log groups as targets for those rules.&lt;/p&gt;

&lt;p&gt;When you send any message to the event bus, you can see it end up in all 3 CloudWatch log groups.&lt;/p&gt;
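&lt;p&gt;For example, you could send a test event with a snippet like the following; the bus name, source, and detail values are placeholders, not values from the repository.&lt;/p&gt;

```python
import json

# Hypothetical test event for the deployed bus. The entry fields match the
# shape expected by EventBridge's PutEvents API.
entry = {
    "EventBusName": "my-event-bus",
    "Source": "com.example.orders",
    "DetailType": "OrderCreated",
    "Detail": json.dumps({"orderId": "123"}),
}
# Usage: boto3.client("events").put_events(Entries=[entry])
```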

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;AWS is well known for providing more than one way to achieve the same result. This is one such example: you can implement catch-all functionality for your event bus by defining rules in 3 different ways.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>eventbridge</category>
      <category>eventdriven</category>
    </item>
  </channel>
</rss>
