<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Håkon Eriksen Drange</title>
    <description>The latest articles on Forem by Håkon Eriksen Drange (@haakoned).</description>
    <link>https://forem.com/haakoned</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2938543%2Fa2dc2bdf-a5af-42f7-949a-a8b5834861a9.jpg</url>
      <title>Forem: Håkon Eriksen Drange</title>
      <link>https://forem.com/haakoned</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/haakoned"/>
    <language>en</language>
    <item>
      <title>AWS Immersion Day talk: Web Application Firewall and AWS Shield Advanced</title>
      <dc:creator>Håkon Eriksen Drange</dc:creator>
      <pubDate>Tue, 25 Nov 2025 20:17:00 +0000</pubDate>
      <link>https://forem.com/haakoned/aws-immersion-day-talk-web-application-firewall-and-aws-shield-advanced-38fk</link>
      <guid>https://forem.com/haakoned/aws-immersion-day-talk-web-application-firewall-and-aws-shield-advanced-38fk</guid>
<description>&lt;p&gt;I was invited by AWS Norway to contribute to their AWS Immersion Day on Zero Trust, which took place on Tuesday, November 25th, 2025.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1tbidmt2qdi6t6i7vrli.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1tbidmt2qdi6t6i7vrli.png" width="648" height="393"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Elshan Hasanov from AWS started with an introduction to the concept of Zero Trust and how Amazon Verified Access and Amazon VPC Lattice can be used for this purpose. Next up, I presented approaches for application and infrastructure security with AWS Web Application Firewall and AWS Shield Advanced.&lt;/p&gt;

&lt;p&gt;You can find the slides below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://hedrange.com/wp-content/uploads/2025/12/2025-11-25-aws-immersion-day-zero-trust-waf-shield-haakon-eriksen-drange.pdf" rel="noopener noreferrer"&gt;Download the slides (PDF)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;About the event&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
In today’s dynamic and distributed cloud environments, traditional perimeter security alone is no longer sufficient to protect against advanced threats. This session explores the paradigm shift towards Zero Trust architecture and its implementation within AWS ecosystems, covering the use of AWS services and features to adopt Zero Trust principles alongside traditional security approaches for fine-grained access control, network segmentation, secure data access, and logging. Join us to learn how to transition to a more secure, resilient, and comprehensive security approach tailored for your AWS environment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to Expect from the Event&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
This session and hands-on workshop provide an in-depth exploration of Zero Trust security principles and their practical application within AWS environments using AWS Verified Access and VPC Lattice, complemented by modern perimeter security strategies including WAF and firewall inspection.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Core Focus Areas&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identity-aware, least-privilege access for both human users and microservices&lt;/li&gt;
&lt;li&gt;Integration of Zero Trust with strategic perimeter controls including WAF&lt;/li&gt;
&lt;li&gt;Practical implementation using AWS Verified Access and Amazon VPC Lattice&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Hands-on Components&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Configure AWS Verified Access for secure remote user access&lt;/li&gt;
&lt;li&gt;Implement Amazon VPC Lattice for service-to-service communication&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Original AWS Experience North event page:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://hedrange.com/wp-content/uploads/2025/12/2025-11-25-zero-trust-event-page.pdf" rel="noopener noreferrer"&gt;Download the event page (PDF)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://hedrange.com/2025/11/25/aws-immersion-day-talk-web-application-firewall-and-aws-shield-advanced/" rel="noopener noreferrer"&gt;AWS Immersion Day talk: Web Application Firewall and AWS Shield Advanced&lt;/a&gt; first appeared on &lt;a href="https://hedrange.com/" rel="noopener noreferrer"&gt;Håkon Eriksen Drange&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>articles</category>
    </item>
    <item>
      <title>AWS User Group Oslo talk: From Vibe Coding to Spec-Driven Development with Kiro</title>
      <dc:creator>Håkon Eriksen Drange</dc:creator>
      <pubDate>Wed, 19 Nov 2025 13:39:13 +0000</pubDate>
      <link>https://forem.com/haakoned/aws-user-group-oslo-talk-from-vibe-coding-to-spec-driven-development-with-kiro-3n7p</link>
      <guid>https://forem.com/haakoned/aws-user-group-oslo-talk-from-vibe-coding-to-spec-driven-development-with-kiro-3n7p</guid>
      <description>&lt;p&gt;On &lt;a href="https://www.meetup.com/aws-user-group-norway/events/311318914/" rel="noopener noreferrer"&gt;Tuesday November 18th at the AWS User Group Oslo Meetup&lt;/a&gt; I shared my views on why builders should pivot from Vibe Coding to Spec-Driven Development with Kiro.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Highlights&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How AI assistants are transforming the Software Development Lifecycle (all code: not only frontend and backend, but Infrastructure-as-Code as well)&lt;/li&gt;
&lt;li&gt;Introduction to Vibe Coding, the good and the not so good parts&lt;/li&gt;
&lt;li&gt;How traditional and established software craftsmanship is becoming more relevant than ever&lt;/li&gt;
&lt;li&gt;How Kiro and the Spec-Driven Development workflow helps bring more structure, quality and speed&lt;/li&gt;
&lt;li&gt;A practical demonstration: adding a new feature to a serverless weather forecast application, deployed on AWS with Terraform and GitHub Actions&lt;/li&gt;
&lt;li&gt;Best practices for specs, agent steering context, MCP and token optimization based on my experience&lt;/li&gt;
&lt;li&gt;Predictions for the future&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can find the slides below. Thanks to AWS Community Builders and the Amazon Kiro team for valuable insights and supporting material.&lt;/p&gt;

&lt;p&gt;For a detailed walk-through, check out &lt;a href="https://hedrange.com/2025/08/11/how-to-use-kiro-for-ai-assisted-spec-driven-development/" rel="noopener noreferrer"&gt;https://hedrange.com/2025/08/11/how-to-use-kiro-for-ai-assisted-spec-driven-development/&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://hedrange.com/wp-content/uploads/2025/11/2025-11-18-aws-user-group-oslo-from-vibe-coding-to-spec-driven-development-with-kiro.pdf" rel="noopener noreferrer"&gt;Download the slides (PDF)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcph4d6sbalserxv11o0m.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcph4d6sbalserxv11o0m.webp" width="800" height="602"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frgmxkpdarbxt4l6mhigz.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frgmxkpdarbxt4l6mhigz.jpeg" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjl8d5h0c4v3e0koh885o.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjl8d5h0c4v3e0koh885o.jpeg" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcyoh86jvgbcglqz3zqoy.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcyoh86jvgbcglqz3zqoy.webp" width="768" height="1024"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Additional resources
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://kiro.dev/" rel="noopener noreferrer"&gt;https://kiro.dev/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://kiro.dev/blog/general-availability/" rel="noopener noreferrer"&gt;https://kiro.dev/blog/general-availability/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The post &lt;a href="https://hedrange.com/2025/11/19/aws-user-group-oslo-talk-going-from-vibe-coding-to-spec-driven-development-with-kiro/" rel="noopener noreferrer"&gt;AWS User Group Oslo talk: From Vibe Coding to Spec-Driven Development with Kiro&lt;/a&gt; first appeared on &lt;a href="https://hedrange.com/" rel="noopener noreferrer"&gt;Håkon Eriksen Drange&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>articles</category>
      <category>aws</category>
      <category>kiro</category>
      <category>specdrivendevelopmen</category>
    </item>
    <item>
      <title>Demonstrate practical AWS skills with new microcredentials</title>
      <dc:creator>Håkon Eriksen Drange</dc:creator>
      <pubDate>Fri, 14 Nov 2025 07:58:42 +0000</pubDate>
      <link>https://forem.com/haakoned/demonstrate-practical-aws-skills-with-new-microcredentials-3918</link>
      <guid>https://forem.com/haakoned/demonstrate-practical-aws-skills-with-new-microcredentials-3918</guid>
<description>&lt;p&gt;AWS has announced a new skills validation program called &lt;a href="https://skillbuilder.aws/category/type/microcredentials" rel="noopener noreferrer"&gt;Microcredentials&lt;/a&gt;. These are more practical and lightweight approaches to validating knowledge than a full, comprehensive certification exam. It’s rewarding to be able to go beyond theoretical knowledge and prove what you’ve actually learned. AWS Certifications and Microcredentials complement each other, validating both your deep technical knowledge and your hands-on skills.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8i5455v8l0o65wcefo3o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8i5455v8l0o65wcefo3o.png" alt="Image demonstrating AWS microcredential lab overview" width="800" height="450"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Image credit: AWS – &lt;a href="https://www.aboutamazon.com/news/aws/aws-ai-certification-learning-tools-skills-development" rel="noopener noreferrer"&gt;https://www.aboutamazon.com/news/aws/aws-ai-certification-learning-tools-skills-development&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  This is how it works
&lt;/h2&gt;

&lt;p&gt;The exam labs run in a live AWS-provisioned environment, similar to the regular lab/SimuLearn tasks on AWS Skill Builder. Over the course of 90 minutes, candidates are presented with a set of challenges to be solved by practical implementation in the AWS Console (ClickOps, no IaC). You have to diagnose issues and implement solutions on your own; no hints or guidance are provided. The exam lab cannot be paused or restarted, so if you have to quit you need to start over again, not much different from an actual certification exam. Candidates who fail to meet the passing score can retake the exam after 25 days.&lt;/p&gt;

&lt;p&gt;As of November 2025 the available training options are:&lt;/p&gt;

&lt;h3&gt;
  
  
  Microcredential Preview Experience
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;This microcredential validates your ability to configure and connect basic AWS services in hands-on scenarios. Key focus areas include S3 static hosting, API Gateway connections, Lambda integrations, and DynamoDB storage. This is not a real microcredential exam lab. This is a trial version you can use to familiarize yourself with the interface.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I recommend starting here to become familiar with the format and set yourself up for success in the actual exam lab.&lt;/p&gt;

&lt;p&gt;AWS Skill Builder link: &lt;a href="https://skillbuilder.aws/learn/JBTFY8M6S8/microcredential-preview-experience/YJFR7KHKR3" rel="noopener noreferrer"&gt;https://skillbuilder.aws/learn/JBTFY8M6S8/microcredential-preview-experience/YJFR7KHKR3&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS Agentic AI Demonstrated
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;AWS Agentic AI Demonstrated&lt;/strong&gt; is a hands-on exam lab designed to help you validate your AWS skills in the &lt;strong&gt;Agentic AI&lt;/strong&gt; domain.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Objectives&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Troubleshoot and repair supervisor and specialist Bedrock Agents&lt;/li&gt;
&lt;li&gt;Troubleshoot and repair an Amazon Bedrock knowledge base&lt;/li&gt;
&lt;li&gt;Integrate and fix a Bedrock Agent&lt;/li&gt;
&lt;li&gt;Enhance Bedrock Agent capabilities&lt;/li&gt;
&lt;li&gt;Integrate Bedrock Guardrails with a Bedrock Agent&lt;/li&gt;
&lt;li&gt;Connect a web application chat client with a Bedrock Agent&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AWS Skill Builder link: &lt;a href="https://skillbuilder.aws/learn/32Y249P272/aws-agentic-ai-demonstrated/TTAJ5WKYTS" rel="noopener noreferrer"&gt;https://skillbuilder.aws/learn/32Y249P272/aws-agentic-ai-demonstrated/TTAJ5WKYTS&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS Serverless Demonstrated
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;AWS Serverless Demonstrated&lt;/strong&gt; is a hands-on exam lab designed to help you validate your AWS skills in the &lt;strong&gt;Serverless&lt;/strong&gt; domain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Objectives&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Configure an AWS Lambda function&lt;/li&gt;
&lt;li&gt;Deploy a REST API&lt;/li&gt;
&lt;li&gt;Configure a Step Functions state machine&lt;/li&gt;
&lt;li&gt;Design and implement an event-driven system&lt;/li&gt;
&lt;li&gt;Optimize AWS Lambda functions for various scenarios&lt;/li&gt;
&lt;li&gt;Configure a CI/CD pipeline for serverless applications&lt;/li&gt;
&lt;li&gt;Configure and analyze monitoring and telemetry&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AWS Skill Builder link: &lt;a href="https://skillbuilder.aws/learn/XV3B4RGA8Q/aws-serverless-demonstrated/BYD5SH8R5C" rel="noopener noreferrer"&gt;https://skillbuilder.aws/learn/XV3B4RGA8Q/aws-serverless-demonstrated/BYD5SH8R5C&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Tip: Completing the &lt;a href="https://skillbuilder.aws/learning-plan/VD4SU58H3W/serverless--knowledge-badge-readiness-path-includes-labs/W62PSCYRZF" rel="noopener noreferrer"&gt;Serverless Knowledge Badge Readiness Path course&lt;/a&gt; first can be helpful (you will also get a badge upon completing a multiple-choice assessment).&lt;/p&gt;

&lt;h2&gt;
  
  
  Result
&lt;/h2&gt;

&lt;p&gt;Passing the microcredential lab exams will unlock some nice new Credly badges you can share with your manager and colleagues, and on social media, to prove the skills you’ve acquired.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6nnpsbilllpcbldktb5j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6nnpsbilllpcbldktb5j.png" width="800" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;My reference: &lt;a href="https://www.credly.com/badges/bc214808-641d-4fb2-b196-e78b530af563" rel="noopener noreferrer"&gt;https://www.credly.com/badges/bc214808-641d-4fb2-b196-e78b530af563&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr2r66tgi0ox5kvjssi7v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr2r66tgi0ox5kvjssi7v.png" width="800" height="263"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;My reference: &lt;a href="https://www.credly.com/badges/80b6282b-026e-482e-bc14-9aa801536435" rel="noopener noreferrer"&gt;https://www.credly.com/badges/80b6282b-026e-482e-bc14-9aa801536435&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Good luck!&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://hedrange.com/2025/11/14/demonstrate-practical-aws-skills-with-new-microcredentials/" rel="noopener noreferrer"&gt;Demonstrate practical AWS skills with new microcredentials&lt;/a&gt; first appeared on &lt;a href="https://hedrange.com/" rel="noopener noreferrer"&gt;Håkon Eriksen Drange&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>articles</category>
      <category>aws</category>
      <category>microcredentials</category>
    </item>
    <item>
      <title>How to use Kiro for AI assisted spec-driven development</title>
      <dc:creator>Håkon Eriksen Drange</dc:creator>
      <pubDate>Mon, 11 Aug 2025 13:35:31 +0000</pubDate>
      <link>https://forem.com/haakoned/how-to-use-kiro-for-ai-assisted-spec-driven-development-2mpa</link>
      <guid>https://forem.com/haakoned/how-to-use-kiro-for-ai-assisted-spec-driven-development-2mpa</guid>
      <description>&lt;p&gt;Read on to learn how to use Kiro for AI assisted spec-driven development of a serverless weather forecasting app, using Terraform for deployment to AWS.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Table of contents&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;Kiro introduces the AI assisted spec-driven development workflow&lt;/li&gt;
&lt;li&gt;
Kiro core concepts

&lt;ul&gt;
&lt;li&gt;Core capabilities&lt;/li&gt;
&lt;li&gt;Specs: Plan and build features using structured specifications&lt;/li&gt;
&lt;li&gt;Hooks: Automate repetitive tasks with intelligent triggers&lt;/li&gt;
&lt;li&gt;Agentic chat: Build features through natural conversation with AI&lt;/li&gt;
&lt;li&gt;Steering: Guide AI with custom rules and context&lt;/li&gt;
&lt;li&gt;MCP Servers: Connect external tools and data sources&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Getting Kiro&lt;/li&gt;

&lt;li&gt;

Starting your first Kiro project

&lt;ul&gt;
&lt;li&gt;Vibe&lt;/li&gt;
&lt;li&gt;Spec&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

Setting up Kiro

&lt;ul&gt;
&lt;li&gt;Model Context Protocol (MCP) servers&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Setting up agent steering context&lt;/li&gt;

&lt;li&gt;

Writing your product spec

&lt;ul&gt;
&lt;li&gt;Step 1: Define requirements&lt;/li&gt;
&lt;li&gt;Step 2: Define design&lt;/li&gt;
&lt;li&gt;Step 3: Implement&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

Deploying the final solution

&lt;ul&gt;
&lt;li&gt;Workflow for adding a new feature&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

Learnings and key takeaways

&lt;ul&gt;
&lt;li&gt;Reflections on context&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Kiro pricing&lt;/li&gt;

&lt;li&gt;Conclusion&lt;/li&gt;

&lt;li&gt;Resources&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Since the advent of Generative AI, coding assistants and their evolution have been a topic of much discussion. As Large Language Models became increasingly precise, AI-based coding assistants or companions have been able to produce increasingly relevant suggestions, which have been incorporated into extensions and new IDEs such as Cursor, in addition to the CLI. We’ve evolved from simple suggestions of a few lines to transformation of complete codebases. The focus is pivoting from &lt;em&gt;assistance&lt;/em&gt; to &lt;em&gt;resolution&lt;/em&gt; of a particular problem and &lt;em&gt;outcomes&lt;/em&gt;: what we instruct the software to achieve. &lt;a href="https://en.wikipedia.org/wiki/Vibe_coding" rel="noopener noreferrer"&gt;Vibe Coding&lt;/a&gt; will probably go down in history as one of the defining terms of 2025. It can be efficient for prototypes and proofs of concept, but how can we know what assumptions and decisions the agent made to get to that result?&lt;/p&gt;

&lt;p&gt;Traditional software development processes are based on initial specification. We need to know the purpose and functionality of what to build before we start building.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/amazonq/latest/qdeveloper-ug/context-project-rules.html" rel="noopener noreferrer"&gt;Project Rules&lt;/a&gt; and &lt;a href="https://docs.aws.amazon.com/amazonq/latest/qdeveloper-ug/customizations.html" rel="noopener noreferrer"&gt;Customizations&lt;/a&gt; in &lt;a href="https://docs.aws.amazon.com/amazonq/latest/qdeveloper-ug/what-is.html" rel="noopener noreferrer"&gt;Amazon Q Developer&lt;/a&gt; were a step in the right direction, but these were some key challenges I still experienced with AI coding assistants earlier in 2025:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;They don’t remember context or state: if you shut down your laptop and continue tomorrow, the context may be lost (this improved with Q Developer)&lt;/li&gt;
&lt;li&gt;How to share context across multiple developers in a team&lt;/li&gt;
&lt;li&gt;How to get more valuable output according to personal/company preference&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://kiro.dev/blog/understanding-kiro-pricing-specs-vibes-usage-tracking/" rel="noopener noreferrer"&gt;Research referenced by AWS&lt;/a&gt; shows that addressing issues during the development phase is &lt;a href="https://www.cs.cmu.edu/afs/cs/academic/class/17654-f01/www/refs/BB.pdf" rel="noopener noreferrer"&gt;5&lt;/a&gt; to &lt;a href="https://www.researchgate.net/figure/BM-System-Science-Institute-Relative-Cost-of-Fixing-Defects_fig1_255965523" rel="noopener noreferrer"&gt;7&lt;/a&gt; times more costly than resolving them during the planning phase of the software development lifecycle. Similarly, it’s less complex and costly to change a system before going to production.&lt;/p&gt;

&lt;p&gt;This principle holds true even with AI coding assistants. When you take the time to discuss requirements and design with Kiro during the planning phase, a single specification request will often accomplish what would otherwise require multiple vibe iterations during implementation. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Garbage_in,_garbage_out" rel="noopener noreferrer"&gt;Garbage in, garbage out&lt;/a&gt;, you know.&lt;/p&gt;

&lt;p&gt;Luckily, Kiro can now help incorporate that well-known structure into AI assistant coding, in a consistent manner.&lt;/p&gt;

&lt;h2&gt;
  
  
  Kiro introduces the AI assisted spec-driven development workflow
&lt;/h2&gt;

&lt;p&gt;Kiro is a new software development IDE based on Visual Studio Code that turns prompts into clear requirements, structured designs and implementation tasks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft9lfb3fa7dkrux3xhi80.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft9lfb3fa7dkrux3xhi80.png" width="800" height="206"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The whole process is validated by tests. Code is generated by AI agents utilizing current Large Language Models (&lt;a href="https://en.wikipedia.org/wiki/Large_language_model" rel="noopener noreferrer"&gt;LLM&lt;/a&gt;s).&lt;/p&gt;

&lt;p&gt;Kiro leverages &lt;a href="https://www.anthropic.com/claude/sonnet" rel="noopener noreferrer"&gt;Anthropic’s Claude Sonnet 4&lt;/a&gt; under the hood, with the option to fall back to Sonnet 3.7 (prefer the newer model). These models are specialized in agentic coding and in tasks across the entire software development lifecycle, from initial planning and implementation to bug fixing, maintenance and refactoring.&lt;/p&gt;

&lt;p&gt;I recommend reading &lt;a href="https://kiro.dev/blog/introducing-kiro/" rel="noopener noreferrer"&gt;Introducing Kiro&lt;/a&gt; and &lt;a href="https://kiro.dev/blog/kiro-and-the-future-of-software-development/" rel="noopener noreferrer"&gt;Kiro and the future of AI spec-driven software development&lt;/a&gt; to get up to speed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Kiro core concepts
&lt;/h2&gt;

&lt;p&gt;Kiro introduces two main modes: Vibe and Spec.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flgi2ge5p7h5tctlwe2vn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flgi2ge5p7h5tctlwe2vn.png" width="800" height="451"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Image courtesy of Kiro&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Core capabilities
&lt;/h3&gt;

&lt;p&gt;You can read more about my experiences with these capabilities in the next chapter. Let me introduce the concepts first.&lt;/p&gt;
&lt;h4&gt;
  
  
  Specs: Plan and build features using structured specifications
&lt;/h4&gt;

&lt;blockquote&gt;
&lt;p&gt;Specs or specifications are structured artifacts that formalize the development process for complex features in your application. They provide a systematic approach to transform high-level ideas into detailed implementation plans with clear tracking and accountability.&lt;/p&gt;

&lt;p&gt;With Kiro’s specs, you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Break down requirements&lt;/strong&gt;  into user stories with acceptance criteria&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build design docs&lt;/strong&gt;  with sequence diagrams and architecture plans&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Track implementation progress&lt;/strong&gt;  across discrete tasks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Collaborate effectively&lt;/strong&gt;  between product and engineering teams&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reference: &lt;a href="https://kiro.dev/docs/specs/" rel="noopener noreferrer"&gt;https://kiro.dev/docs/specs/&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h4&gt;
  
  
  Hooks: Automate repetitive tasks with intelligent triggers
&lt;/h4&gt;

&lt;blockquote&gt;
&lt;p&gt;Agent Hooks are automated triggers that execute predefined agent actions when specific events occur in the Kiro IDE. When files are created, saved or deleted you can configure hooks to be run for common tasks to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Maintain consistent code quality&lt;/li&gt;
&lt;li&gt;Prevent security vulnerabilities&lt;/li&gt;
&lt;li&gt;Reduce manual overhead&lt;/li&gt;
&lt;li&gt;Standardize team processes&lt;/li&gt;
&lt;li&gt;Create faster development cycles&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reference: &lt;a href="https://kiro.dev/docs/hooks/" rel="noopener noreferrer"&gt;https://kiro.dev/docs/hooks/&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is an area I have not explored in detail yet.&lt;/p&gt;
&lt;h4&gt;
  
  
  Agentic chat: Build features through natural conversation with AI
&lt;/h4&gt;

&lt;p&gt;Kiro offers a &lt;a href="https://kiro.dev/docs/chat/" rel="noopener noreferrer"&gt;chat&lt;/a&gt; panel where you can interact with your code through natural language conversations. Just tell Kiro what you need. Ask questions about your codebase, request explanations for complex logic, generate new features, debug tricky issues, and automate repetitive tasks—all while Kiro maintains complete context of your project.&lt;/p&gt;
&lt;h4&gt;
  
  
  Steering: Guide AI with custom rules and context
&lt;/h4&gt;

&lt;p&gt;I believe this is one of the most powerful capabilities Kiro introduces. To quote the &lt;a href="https://kiro.dev/docs/" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Steering gives Kiro persistent knowledge about your project through markdown files in &lt;code&gt;.kiro/steering/&lt;/code&gt;. Instead of explaining your conventions in every chat, steering files ensure Kiro consistently follows your established patterns, libraries, and standards.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Consistent Code Generation&lt;/strong&gt;  – Every component, API endpoint, or test follows your team’s established patterns and conventions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reduced Repetition&lt;/strong&gt;  – No need to explain project standards in each conversation. Kiro remembers your preferences.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Team Alignment&lt;/strong&gt;  – All developers work with the same standards, whether they’re new to the project or seasoned contributors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalable Project Knowledge&lt;/strong&gt;  – Documentation that grows with your codebase, capturing decisions and patterns as your project evolves.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reference: &lt;a href="https://kiro.dev/docs/steering/" rel="noopener noreferrer"&gt;https://kiro.dev/docs/steering/&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
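
&lt;p&gt;To make this concrete, a minimal steering file could look something like the sketch below. The file name and the rules are my own hypothetical example for a project like the weather app in this post, not taken from the Kiro docs:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# .kiro/steering/infrastructure.md (hypothetical example)

- All infrastructure is defined with Terraform; do not suggest CloudFormation or CDK.
- Lambda functions use Python 3.12 and live under src/.
- Every new feature must include unit tests and pass terraform validate in GitHub Actions.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Once a file like this is in place, every chat and spec session starts from these conventions instead of you restating them.&lt;/p&gt;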
&lt;h4&gt;
  
  
  MCP Servers: Connect external tools and data sources
&lt;/h4&gt;

&lt;p&gt;In my opinion, the main input for valuable and tailored results is your combination of Steering and Model Context Protocol servers. MCP extends Kiro’s capabilities by connecting to specialized servers that provide additional tools and context, tailored to your environment.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;MCP is a protocol that allows Kiro to communicate with external servers to access specialized tools and information. For example, the AWS Documentation MCP server provides tools to search, read, and get recommendations from AWS documentation directly within Kiro.&lt;/p&gt;

&lt;p&gt;With MCP, you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Access specialized knowledge bases and documentation&lt;/li&gt;
&lt;li&gt;Integrate with external services and APIs&lt;/li&gt;
&lt;li&gt;Extend Kiro’s capabilities with domain-specific tools&lt;/li&gt;
&lt;li&gt;Create custom tools for your specific workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reference: &lt;a href="https://kiro.dev/docs/mcp/" rel="noopener noreferrer"&gt;https://kiro.dev/docs/mcp/&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  Getting Kiro
&lt;/h2&gt;

&lt;p&gt;As of early August 2025, Kiro is still in public preview with limited availability and a waiting list.&lt;/p&gt;

&lt;p&gt;Assuming you have been able to get Kiro from &lt;a href="https://kiro.dev/" rel="noopener noreferrer"&gt;https://kiro.dev/&lt;/a&gt;, go through their &lt;a href="https://kiro.dev/docs/getting-started/" rel="noopener noreferrer"&gt;Get started guide&lt;/a&gt; to learn more about the basic concepts.&lt;/p&gt;
&lt;h2&gt;
  
  
  Starting your first Kiro project
&lt;/h2&gt;

&lt;p&gt;Open a new folder to start a new project, and you are presented with the option to build in Vibe or Spec style.&lt;/p&gt;
&lt;h4&gt;
  
  
  Vibe
&lt;/h4&gt;

&lt;p&gt;Chat first, then build. Explore ideas and iterate as you discover needs.&lt;/p&gt;

&lt;p&gt;Great for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rapid exploration and testing&lt;/li&gt;
&lt;li&gt;Building when requirements are unclear&lt;/li&gt;
&lt;li&gt;Implementing a task&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  Spec
&lt;/h4&gt;

&lt;p&gt;Plan first, then build. Create requirements and design before coding starts.&lt;/p&gt;

&lt;p&gt;Great for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Thinking through features in depth&lt;/li&gt;
&lt;li&gt;Projects needing upfront planning&lt;/li&gt;
&lt;li&gt;Building features in a structured way&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7tpghs4yhl3rw1dw77ia.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7tpghs4yhl3rw1dw77ia.png" width="800" height="623"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In my case I’ve used Amazon Q Developer in VS Code and CLI quite a lot for coding acceleration, so I went directly for Spec. Let’s get back to working with specifications after we have configured the remaining parts of our Kiro workspace.&lt;/p&gt;
&lt;h2&gt;
  
  
  Setting up Kiro
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Model Context Protocol (MCP) servers
&lt;/h3&gt;

&lt;p&gt;Enriching your environment with relevant MCP servers can be a massive boost. Take a look at the official MCP servers from AWS at &lt;a href="https://github.com/awslabs/mcp" rel="noopener noreferrer"&gt;https://github.com/awslabs/mcp&lt;/a&gt;; there is already a ton available.&lt;/p&gt;

&lt;p&gt;The ones I currently enjoy in Kiro and Amazon Q Developer, with a focus on Terraform development and AWS, are:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flqyyfkpdftuqjcc7koc4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flqyyfkpdftuqjcc7koc4.png" width="348" height="241"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/awslabs/mcp/blob/main/src/core-mcp-server" rel="noopener noreferrer"&gt;AWS Core&lt;/a&gt; provides tools for prompt understanding and translation to AWS services&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/awslabs/mcp/blob/main/src/aws-documentation-mcp-server" rel="noopener noreferrer"&gt;AWS Docs&lt;/a&gt; and AWS Knowledge can read, search and recommend from the official, up-to-date AWS Documentation. &lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/awslabs/mcp/blob/main/src/aws-api-mcp-server" rel="noopener noreferrer"&gt;AWS API&lt;/a&gt; can suggest for you and call AWS CLI commands on your behalf.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/awslabs/mcp/blob/main/src/cdk-mcp-server" rel="noopener noreferrer"&gt;AWS CDK&lt;/a&gt; can provide guidance and generate CDK stacks.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/awslabs/mcp/blob/main/src/terraform-mcp-server" rel="noopener noreferrer"&gt;AWS Terraform&lt;/a&gt; can search AWS and AWSCC provider docs, Execute Terraform and Terragrunt Commands, run Checkov scans and search user provided Terraform registry modules. Super valuable!&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/awslabs/mcp/blob/main/src/aws-serverless-mcp-server" rel="noopener noreferrer"&gt;AWS Serverless&lt;/a&gt; can provide guidance, search schemas, deploy serverless applications, get metrics and so on.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/awslabs/mcp/blob/main/src/aws-diagram-mcp-server" rel="noopener noreferrer"&gt;AWS Diagram&lt;/a&gt; get generate diagrams, get diagram examples and list icons. You can generate architecture diagrams with official AWS icons, flow and sequence charts and so on. Content can be provided as a static image or in Draw.IO XML format, so that you can finishing up the final touches and corrections yourself.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/awslabs/mcp/blob/main/src/aws-pricing-mcp-server" rel="noopener noreferrer"&gt;AWS Pricing&lt;/a&gt; can analyze CDK and Terraform projects, query the official pricing API and generate cost reports, much more efficient than manually working with the AWS Cost Calculator.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The MCP configuration feature in Kiro supports two modes: User Config (global) and Workspace Config.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fopyg6t449p54el7i1rby.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fopyg6t449p54el7i1rby.png" width="255" height="111"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Feature&lt;/th&gt;&lt;th&gt;User Config&lt;/th&gt;&lt;th&gt;Workspace Config&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Config file location&lt;/td&gt;&lt;td&gt;~/.kiro/settings/mcp.json&lt;/td&gt;&lt;td&gt;my-kiro-project/.kiro/settings/mcp.json&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Application&lt;/td&gt;&lt;td&gt;Global, across all your Kiro projects&lt;/td&gt;&lt;td&gt;Local, in the current Kiro project&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Usage guidance&lt;/td&gt;&lt;td&gt;Keep common Kiro context configuration on your global system.&lt;/td&gt;&lt;td&gt;Keep all Kiro project context configuration within the project.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Personally I prefer Workspace Config, keeping all context in the project. This makes behavior easier and more predictable for colleagues, since everyone shares the same MCP server configuration. It also yields more consistent outputs: if team members have different context settings, results are not guaranteed to be consistent, which can lead to implementation differences and bugs.&lt;/p&gt;

&lt;p&gt;Here is my current Workspace MCP Config, which I have added to a common agent-steering-bootstrap Git repo:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;textarea tabindex="-1" aria-hidden="true" readonly&amp;gt;{
  "mcpServers": {
    "fetch": {
      "command": "uvx",
      "args": [
        "mcp-server-fetch"
      ],
      "env": {},
      "disabled": false,
      "autoApprove": []
    },
    "aws-docs": {
      "command": "uvx",
      "args": [
        "awslabs.aws-documentation-mcp-server@latest"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": [
        "search_documentation",
        "read_documentation"
      ]
    },
    "aws-core": {
      "command": "uvx",
      "args": [
        "awslabs.core-mcp-server@latest"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": []
    },
    "aws-api": {
      "command": "uvx",
      "args": [
        "awslabs.aws-api-mcp-server@latest"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": []
    },
    "aws-knowledge-mcp-server": {
      "command": "uvx",
      "args": [
        "mcp-proxy",
        "--transport",
        "streamablehttp",
        "https://knowledge-mcp.global.api.aws"
      ],
      "disabled": false,
      "autoApprove": []
    },
    "aws-cdk": {
      "command": "uvx",
      "args": [
        "awslabs.cdk-mcp-server@latest"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": []
    },
    "aws-terraform": {
      "command": "uvx",
      "args": [
        "awslabs.terraform-mcp-server@latest"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": []
    },
    "aws-serverless": {
      "command": "uvx",
      "args": [
        "awslabs.aws-serverless-mcp-server@latest"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": []
    },
    "awslabs-diagram": {
      "command": "uvx",
      "args": [
        "awslabs.aws-diagram-mcp-server"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": [
        "get_diagram_examples",
        "generate_diagram"
      ]
    },
    "awslabs-pricing": {
      "command": "uvx",
      "args": [
        "awslabs.aws-pricing-mcp-server@latest"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "AWS_PROFILE": "default",
        "AWS_REGION": "eu-west-1"
      },
      "disabled": false,
      "autoApprove": [
        "get_pricing_service_codes",
        "get_pricing_service_attributes",
        "get_pricing_attribute_values",
        "get_pricing"
      ]
    }
  }
}&amp;lt;/textarea&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "mcpServers": {
    "fetch": {
      "command": "uvx",
      "args": [
        "mcp-server-fetch"
      ],
      "env": {},
      "disabled": false,
      "autoApprove": []
    },
    "aws-docs": {
      "command": "uvx",
      "args": [
        "awslabs.aws-documentation-mcp-server@latest"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": [
        "search_documentation",
        "read_documentation"
      ]
    },
    "aws-core": {
      "command": "uvx",
      "args": [
        "awslabs.core-mcp-server@latest"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": []
    },
    "aws-api": {
      "command": "uvx",
      "args": [
        "awslabs.aws-api-mcp-server@latest"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": []
    },
    "aws-knowledge-mcp-server": {
      "command": "uvx",
      "args": [
        "mcp-proxy",
        "--transport",
        "streamablehttp",
        "https://knowledge-mcp.global.api.aws"
      ],
      "disabled": false,
      "autoApprove": []
    },
    "aws-cdk": {
      "command": "uvx",
      "args": [
        "awslabs.cdk-mcp-server@latest"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": []
    },
    "aws-terraform": {
      "command": "uvx",
      "args": [
        "awslabs.terraform-mcp-server@latest"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": []
    },
    "aws-serverless": {
      "command": "uvx",
      "args": [
        "awslabs.aws-serverless-mcp-server@latest"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": []
    },
    "awslabs-diagram": {
      "command": "uvx",
      "args": [
        "awslabs.aws-diagram-mcp-server"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": [
        "get_diagram_examples",
        "generate_diagram"
      ]
    },
    "awslabs-pricing": {
      "command": "uvx",
      "args": [
        "awslabs.aws-pricing-mcp-server@latest"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "AWS_PROFILE": "default",
        "AWS_REGION": "eu-west-1"
      },
      "disabled": false,
      "autoApprove": [
        "get_pricing_service_codes",
        "get_pricing_service_attributes",
        "get_pricing_attribute_values",
        "get_pricing"
      ]
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
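&lt;p&gt;A typo in this JSON is easy to make. As a quick sanity check before committing, here is a small illustrative Python sketch (my own helper, not part of Kiro) that parses such a config and verifies each server entry defines a launcher command and an argument list:&lt;/p&gt;

```python
import json

# A trimmed sample of the workspace MCP configuration shown above
sample = '''{
  "mcpServers": {
    "aws-docs": {
      "command": "uvx",
      "args": ["awslabs.aws-documentation-mcp-server@latest"],
      "disabled": false,
      "autoApprove": ["search_documentation", "read_documentation"]
    }
  }
}'''

def validate_mcp_config(text):
    """Return the names of server entries that define a command and an args list."""
    config = json.loads(text)
    names = []
    for name, server in config["mcpServers"].items():
        if "command" not in server:
            raise ValueError(f"{name}: missing 'command'")
        if not isinstance(server.get("args", []), list):
            raise ValueError(f"{name}: 'args' must be a list")
        names.append(name)
    return names

print(validate_mcp_config(sample))  # ['aws-docs']
```

&lt;p&gt;Running the same check against the real file before each push catches malformed entries early, instead of debugging why a server never appears in Kiro.&lt;/p&gt;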



&lt;h2&gt;
  
  
  Setting up agent steering context
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Steering gives Kiro persistent knowledge about your project through markdown files in &lt;code&gt;.kiro/steering/&lt;/code&gt;. Instead of explaining your conventions in every chat, steering files ensure Kiro consistently follows your established patterns, libraries, and standards.&lt;/p&gt;

&lt;p&gt;Key benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Consistent Code Generation&lt;/strong&gt;  – Every component, API endpoint, or test follows your team’s established patterns and conventions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reduced Repetition&lt;/strong&gt;  – No need to explain project standards in each conversation. Kiro remembers your preferences.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Team Alignment&lt;/strong&gt;  – All developers work with the same standards, whether they’re new to the project or seasoned contributors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalable Project Knowledge&lt;/strong&gt;  – Documentation that grows with your codebase, capturing decisions and patterns as your project evolves.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reference: &lt;a href="https://kiro.dev/docs/steering/" rel="noopener noreferrer"&gt;https://kiro.dev/docs/steering/&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcs0n0uyoqiveqgudy5fx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcs0n0uyoqiveqgudy5fx.png" width="338" height="129"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flqc2yx07n6afcab7hp2z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flqc2yx07n6afcab7hp2z.png" width="377" height="145"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The MCP steering file contains background information and guidance about how Kiro can leverage the active MCP servers.&lt;/p&gt;

&lt;p&gt;The Product file focuses on common product development context and principles.&lt;/p&gt;

&lt;p&gt;The Structure file focuses on how the files in your codebase are structured, to align with company standards and preferences.&lt;/p&gt;

&lt;p&gt;In the Tech file I define general patterns and principles for approaching solutions deployed to AWS with Terraform.&lt;/p&gt;

&lt;p&gt;Most companies have product development principles, architecture and software guidelines documented in their internal wikis. Getting this context into Kiro’s Agent Steering files is crucial for outputs that match company and team preferences.&lt;/p&gt;
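&lt;p&gt;To summarize, a Kiro workspace configured along the lines of this post ends up with a layout roughly like this (illustrative; your steering file names may differ):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.kiro/
├── settings/
│   └── mcp.json          # Workspace MCP server configuration
├── steering/
│   ├── mcp.md            # How Kiro should leverage the active MCP servers
│   ├── product.md        # Product development context and principles
│   ├── structure.md      # Codebase structure conventions
│   └── tech.md           # Technology stack and patterns
└── specs/
    └── weather-forecast-app/
        └── requirements.md
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;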

&lt;p&gt;Example from tech.md:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;textarea tabindex="-1" aria-hidden="true" readonly&amp;gt;# Technology Stack

This document outlines the technical foundation and tooling for the project.

## Build System &amp;amp;amp; Tools
- CI/CD is out of scope of this Terraform module.

## Application Tech Stack
- Python3, Boto3, Jinja templating etc.
- Unit testing of core functionality
- Basic testing of Terraform code
- Provide /health endpoint for REST APIs
- Python operations should take place in a virtual environment where optimal Python version is installed with pyenv

## Infrastructure Tech Stack
- Terraform for Infrastructure-as-Code
- Terraform providers aws and awscc, if necessary
- Leverage community modules from https://github.com/terraform-aws-modules as relevant
- AWS Serverless architecture options are preferred for minimal operational overhead
- The AWS infrastructure is Well-Architected
- The AWS infrastructure is secure as per the latest CIS AWS Security Hub control standard
- Terraform code is unit tested Terraform's native testing framework, HCL-based tests.

## Observability
- For serverless components, AWS X-Ray is leveraged for tracing
- Logs are directed to AWS CloudWatch Logs. CloudWatch Logs groups have a retention period of 180 days. 
- A solution specific AWS Cloudwatch Dashboard which includes relevant CloudWatch metrics for reliability, performance and cost, in addition to a list over the last failing requests

### Pre-commit for Terraform
- Pre-commit is installed and leveraged for validation and formatting. 
  - terraform_fmt
  - terraform_docs in main README.md
  - check-merge-conflict
  - trailing-whitespace
  - mixed-line-ending

Example .pre-commit-config.yaml located in the root directory:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;repos:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;repo: &lt;a href="https://github.com/antonbabenko/pre-commit-terraform" rel="noopener noreferrer"&gt;https://github.com/antonbabenko/pre-commit-terraform&lt;/a&gt;
rev: v1.77.3
hooks:

&lt;ul&gt;
&lt;li&gt;id: terraform_fmt&lt;/li&gt;
&lt;li&gt;id: terraform_docs
args: ["--args=--sort-by required"]&lt;/li&gt;
&lt;li&gt;id: terraform_checkov
args:

&lt;ul&gt;
&lt;li&gt;--args=--quiet&lt;/li&gt;
&lt;li&gt;--args=--download-external-modules false&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;repo: &lt;a href="https://github.com/pre-commit/pre-commit-hooks" rel="noopener noreferrer"&gt;&lt;/a&gt;&lt;a href="https://github.com/pre-commit/pre-commit-hooks" rel="noopener noreferrer"&gt;https://github.com/pre-commit/pre-commit-hooks&lt;/a&gt;
rev: v4.4.0
hooks:

&lt;ul&gt;
&lt;li&gt;id: check-merge-conflict&lt;/li&gt;
&lt;li&gt;id: trailing-whitespace
args: [--markdown-linebreak-ext=md]&lt;/li&gt;
&lt;li&gt;id: mixed-line-ending
args: ["--fix=lf"]
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
## Documentation
- AWS Labs Diagram MCP server is used to produce relevant architecture, flow and sequence diagrams, included in main README.md
- AWS Labs Pricing MCP server is used to perform a basic cost calculation of the solution, included in the main README.md
- Every task should also ensure relevant and clear documentation is created or up to date. Prefer simple and user friendly documentation, don't overcomplicate.
- All documentation follows markdown format and is stored in the `docs/` directory
- Architecture diagrams are generated programmatically using the AWS diagram MCP server
- Cost analysis documentation includes detailed breakdowns, usage projections, and optimization recommendations
- Documentation includes deployment guides, troubleshooting guides, and operational runbooks
- There should be an examples folder with README.md explaining how to include the Terraform module call in an existing CI/CD codebase.
- In documentation, provide TL;DR to make it easy and quick for developers to get up to speed. 
- In high level project documentation, include an executive summary for target group project owners, to articulate functionality and the value the solution provides.

## Cost Management
- AWS Labs Pricing MCP server provides accurate cost calculations for the infrastructure components of the solution.
- Cost analysis should include environment-specific projections (staging and production).
- Cost analysis should include AWS region comparison of eu-west-1, eu-central-1 and eu-north-1 for the production environment.
- CloudWatch cost metrics and dashboards provide real-time cost monitoring.
- A solution specific AWS Budget is deployed, based on infrastructure tag Key Service. Budget alerts prevents unexpected charges.
- Guidance is provided for the top three cost items that may increase with heavy production load. 
- Cost documentation is included in the main `README.md`

## Principles
- Favor KISS over complexity, simplicity over comprehensibility
- Respect and adopt well-known cloud based architecture and integration patterns

&amp;lt;/textarea&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Technology Stack

This document outlines the technical foundation and tooling for the project.

## Build System &amp;amp; Tools
- CI/CD is out of scope of this Terraform module.

## Application Tech Stack
- Python3, Boto3, Jinja templating etc.
- Unit testing of core functionality
- Basic testing of Terraform code
- Provide /health endpoint for REST APIs
- Python operations should take place in a virtual environment where optimal Python version is installed with pyenv

## Infrastructure Tech Stack
- Terraform for Infrastructure-as-Code
- Terraform providers aws and awscc, if necessary
- Leverage community modules from https://github.com/terraform-aws-modules as relevant
- AWS Serverless architecture options are preferred for minimal operational overhead
- The AWS infrastructure is Well-Architected
- The AWS infrastructure is secure as per the latest CIS AWS Security Hub control standard
- Terraform code is unit tested with Terraform's native testing framework (HCL-based tests).

## Observability
- For serverless components, AWS X-Ray is leveraged for tracing
- Logs are directed to AWS CloudWatch Logs. CloudWatch Logs groups have a retention period of 180 days. 
- A solution specific AWS CloudWatch Dashboard which includes relevant CloudWatch metrics for reliability, performance and cost, in addition to a list of the most recent failing requests

### Pre-commit for Terraform
- Pre-commit is installed and leveraged for validation and formatting. 
  - terraform_fmt
  - terraform_docs in main README.md
  - check-merge-conflict
  - trailing-whitespace
  - mixed-line-ending

Example .pre-commit-config.yaml located in the root directory:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;repos:
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: v1.77.3
    hooks:
      - id: terraform_fmt
      - id: terraform_docs
        args: ["--args=--sort-by required"]
      - id: terraform_checkov
        args:
          - --args=--quiet
          - --args=--download-external-modules false
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: check-merge-conflict
      - id: trailing-whitespace
        args: [--markdown-linebreak-ext=md]
      - id: mixed-line-ending
        args: ["--fix=lf"]
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
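&lt;p&gt;With the configuration file in place, the hooks are installed and exercised like this (assuming pre-commit itself is already installed, for example via pip or pipx):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Install the Git hook scripts defined in .pre-commit-config.yaml
pre-commit install

# Run all hooks against all files once, e.g. before the first commit
pre-commit run --all-files
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;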

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
## Documentation
- AWS Labs Diagram MCP server is used to produce relevant architecture, flow and sequence diagrams, included in main README.md
- AWS Labs Pricing MCP server is used to perform a basic cost calculation of the solution, included in the main README.md
- Every task should also ensure relevant and clear documentation is created or up to date. Prefer simple and user friendly documentation, don't overcomplicate.
- All documentation follows markdown format and is stored in the `docs/` directory
- Architecture diagrams are generated programmatically using the AWS diagram MCP server
- Cost analysis documentation includes detailed breakdowns, usage projections, and optimization recommendations
- Documentation includes deployment guides, troubleshooting guides, and operational runbooks
- There should be an examples folder with README.md explaining how to include the Terraform module call in an existing CI/CD codebase.
- In documentation, provide TL;DR to make it easy and quick for developers to get up to speed. 
- In high level project documentation, include an executive summary for target group project owners, to articulate functionality and the value the solution provides.

## Cost Management
- AWS Labs Pricing MCP server provides accurate cost calculations for the infrastructure components of the solution.
- Cost analysis should include environment-specific projections (staging and production).
- Cost analysis should include AWS region comparison of eu-west-1, eu-central-1 and eu-north-1 for the production environment.
- CloudWatch cost metrics and dashboards provide real-time cost monitoring.
- A solution specific AWS Budget is deployed, based on infrastructure tag Key Service. Budget alerts prevent unexpected charges.
- Guidance is provided for the top three cost items that may increase with heavy production load. 
- Cost documentation is included in the main `README.md`

## Principles
- Favor KISS over complexity, simplicity over comprehensibility
- Respect and adopt well-known cloud based architecture and integration patterns

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Writing your product spec
&lt;/h2&gt;

&lt;p&gt;Product specs, or specifications, are structured artifacts that formalize the development process. They provide a systematic approach to transform high-level ideas into detailed implementation plans with clear tracking and accountability.&lt;/p&gt;

&lt;p&gt;The workflow is illustrated below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg8sk2x4xwbhvpqyyzhnn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg8sk2x4xwbhvpqyyzhnn.png" width="584" height="1024"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From the Kiro pane, click the &lt;code&gt;+&lt;/code&gt; button under  &lt;strong&gt;Specs&lt;/strong&gt;. Alternatively, choose  &lt;strong&gt;Spec&lt;/strong&gt;  from the chat pane.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvntspi3diwkberyqo5to.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvntspi3diwkberyqo5to.png" width="261" height="58"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Describe your project idea.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6l3tksc2v8eolt0c0dy1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6l3tksc2v8eolt0c0dy1.png" width="647" height="157"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A requirements Markdown file is created in a folder named after the spec, &lt;code&gt;weather-forecast-app&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvcyqwep1j4xgxm4t0ebb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvcyqwep1j4xgxm4t0ebb.png" width="800" height="141"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 1: Define requirements
&lt;/h4&gt;

&lt;p&gt;The requirements.md file should define user stories with acceptance criteria in &lt;a href="https://alistairmavin.com/ears/" rel="noopener noreferrer"&gt;EARS&lt;/a&gt; notation, similar to common agile development practice. Define what we would like to achieve and which problems we propose to solve. How we plan to solve it comes afterwards, in the Design phase.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://alistairmavin.com/ears/" rel="noopener noreferrer"&gt;EARS&lt;/a&gt;, which stands for Easy Approach to Requirements Syntax, is a method for writing clear and unambiguous requirements using a structured set of rules and keywords. Alistair Mavin and colleagues at Rolls-Royce PLC developed EARS whilst analysing the airworthiness regulations for a jet engine’s control system. The structured format makes it easy to understand what is expected, reducing misinterpretations. Clearer requirements lead to better test cases and easier verification of application functionality.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;textarea tabindex="-1" aria-hidden="true" readonly&amp;gt;WHEN [condition/event]
THE SYSTEM SHALL [expected behavior]&amp;lt;/textarea&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WHEN [condition/event]
THE SYSTEM SHALL [expected behavior]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
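&lt;p&gt;The WHEN pattern shown above is the event-driven EARS template. EARS defines a small set of such templates; paraphrased from the EARS material, the remaining common ones are:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;THE SYSTEM SHALL [expected behavior]                          (ubiquitous)
WHILE [state] THE SYSTEM SHALL [expected behavior]            (state-driven)
IF [unwanted trigger], THEN THE SYSTEM SHALL [response]       (unwanted behavior)
WHERE [feature is included] THE SYSTEM SHALL [behavior]       (optional feature)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;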



&lt;p&gt;Start writing your user stories as acceptance criteria in EARS format. Remember, as described in the Introduction: the more complete the context you provide, the better. Be as specific as you can, and include the conventions your team carries in their heads and has learned by experience. Investing more time in producing well-crafted specifications can reduce time spent on modifications and troubleshooting.&lt;/p&gt;

&lt;p&gt;Remember, common principles and guidelines are defined as Agent Steering Context. Product Spec focuses on the functionality of the application. As the system grows you can create additional specifications and manage requirements in logical separation.&lt;/p&gt;

&lt;p&gt;This is how the &lt;a href="https://github.com/haakond/terraform-aws-weather-forecast/blob/main/.kiro/specs/weather-forecast-app/requirements.md" rel="noopener noreferrer"&gt;.kiro/specs/weather-forecast-app/requirements.md&lt;/a&gt; for the example weather forecast application looks like:&lt;/p&gt;








&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Requirements Document

## Introduction

This specification will create a weather forecast application, to be deployed with Terraform on AWS serverless infrastructure.

### Requirement 1

**User Story:** As an end user, I will access a web site which compares the weather forecast for tomorrow for the European cities Oslo (Norway), Paris (France), London (United Kingdom) and Barcelona (Spain).

#### Acceptance Criteria

1. WHEN an end user is accessing the service THEN the system SHALL display a simple web site with a fancy design for the weather forecast for the cities as described in the User Story
2. WHEN an end user is accessing the service THEN the system SHALL be snappy and respond fast
3. WHEN an end user is accessing the service on a mobile device THEN the design SHALL be optimized for display on a small screen
4. WHEN static content is served to end users THEN the system SHALL set Cache-Control headers with Max-Age of 900 seconds (15 minutes) to optimize performance and reduce server load

### Requirement 2

**User Story:** As a developer, my application requirements are as follows:

#### Acceptance Criteria

1. WHEN the weather-forecast-app is generated THEN the system SHALL provide a modern front-end application
2. WHEN the weather-forecast-app is generated THEN the system SHALL look up weather forecasts from https://api.met.no/weatherapi/locationforecast/2.0/documentation and cache the results for 1 hour
3. WHEN the weather-forecast-app is generated THEN the system SHALL respect the Terms of Service defined at https://developer.yr.no/doc/TermsOfService/
4. WHEN the weather-forecast-app is generated THEN the system SHALL be tested
5. WHEN the Lambda function successfully retrieves weather data from the backend API THEN the system SHALL set cache-control: max-age=60 on the HTTP response
6. WHEN the Lambda function fails to retrieve weather data from the backend API THEN the system SHALL set cache-control: max-age=0 on the HTTP response
7. WHEN the frontend displays weather data THEN the system SHALL show the Last updated timestamp from the lastUpdated field in the API response
8. WHEN weather data is cached in DynamoDB and the weather API does not provide timestamp information THEN the system SHALL use the DynamoDB cache timestamp as the lastUpdated value in the API response

### Requirement 3

**User Story:** As a developer, my cloud infrastructure requirements are as follows:

#### Acceptance Criteria

1. WHEN the infrastructure is generated THEN the codebase SHALL be organized as one, self-contained Terraform module
2. WHEN the infrastructure is generated THEN the system SHALL require basic unit tests and infrastructure-as-code validation to be successful
3. WHEN the infrastructure is deployed THEN the system SHALL create AWS resources with appropriate tags like Service:weather-forecast-app.
4. WHEN the infrastructure is deployed THEN the system SHALL package and deploy the weather-forecast-app code
5. WHEN the infrastructure is deployed THEN the system SHALL provide accessible endpoints for testing
6. WHEN the infrastructure is deployed THEN the system SHALL include required IAM roles and permissions
7. WHEN the infrastructure is deployed THEN the system SHALL output relevant URLs or connection information
8. WHEN the infrastructure is deployed THEN the system SHALL be configured for high availability
9. WHEN the CloudFront distribution is deployed THEN the system SHALL use price class 100 to optimize costs while covering Europe and the United States edge locations
10. WHEN the CloudFront distribution is deployed THEN the system SHALL allow only GET, HEAD, and OPTIONS HTTP methods for security and performance optimization
11. WHEN the CloudFront distribution is deployed THEN the system SHALL configure caching policies based on query parameters to optimize cache efficiency
12. WHEN the CloudFront distribution is deployed THEN the system SHALL set the default TTL to 900 seconds (15 minutes) to align with static content caching requirements
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
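&lt;p&gt;Acceptance criteria written this way translate almost mechanically into code and tests. As a hypothetical illustration (not the article’s actual implementation), criteria 2.5 and 2.6 above could reduce to something like:&lt;/p&gt;

```python
# Hypothetical sketch of acceptance criteria 2.5 and 2.6:
# successful weather lookups are cacheable for 60 seconds,
# failed lookups must not be cached.
def cache_control_header(lookup_succeeded):
    return "max-age=60" if lookup_succeeded else "max-age=0"

def build_response(status_code, body, lookup_succeeded):
    # Shape loosely follows a Lambda proxy integration response.
    return {
        "statusCode": status_code,
        "headers": {"cache-control": cache_control_header(lookup_succeeded)},
        "body": body,
    }

print(build_response(200, "{}", True)["headers"]["cache-control"])   # max-age=60
print(build_response(502, "{}", False)["headers"]["cache-control"])  # max-age=0
```

&lt;p&gt;Each EARS criterion then maps one-to-one to an assertion in a unit test, which is exactly what makes the format verifiable.&lt;/p&gt;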



&lt;p&gt;If you prefer, you can write your requirements the way you are used to and then click &lt;em&gt;Refine&lt;/em&gt; to have Kiro reformat them into EARS, but I would say it’s good team practice to align on EARS from the start.&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 2: Define design
&lt;/h4&gt;

&lt;p&gt;Switch to the design tab and click &lt;em&gt;Refine&lt;/em&gt; to generate a design specification based on the defined requirements, merged with the Agent Steering context configuration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd7i1zp91rya4350ppurc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd7i1zp91rya4350ppurc.png" width="800" height="153"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then Kiro starts working.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxumgk0hxkde1udq557f1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxumgk0hxkde1udq557f1.png" width="800" height="472"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Kiro has now populated design.md. When I take a closer look, I don’t see anything addressing the Weather API Terms of Service rule: you must identify yourself by setting a custom HTTP User-Agent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fshlyxznld20l0sxri1b1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fshlyxznld20l0sxri1b1.png" width="800" height="168"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can have Kiro help refine and add more requirements as you go along.&lt;/p&gt;
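&lt;p&gt;For context, the identification rule boils down to sending a custom User-Agent on every request. A minimal, hypothetical Python sketch using urllib, as the implementation plan suggests (the application name and COMPANY_WEBSITE value are my placeholders, not the article’s code):&lt;/p&gt;

```python
import urllib.parse
import urllib.request

# met.no's Terms of Service require clients to identify themselves.
# COMPANY_WEBSITE stands in for the configurable value from the plan.
COMPANY_WEBSITE = "example.com"

def build_forecast_request(lat, lon):
    query = urllib.parse.urlencode({"lat": lat, "lon": lon})
    url = "https://api.met.no/weatherapi/locationforecast/2.0/compact?" + query
    # The custom User-Agent identifies the application and its operator.
    return urllib.request.Request(
        url,
        headers={"User-Agent": "weather-forecast-app " + COMPANY_WEBSITE},
    )

req = build_forecast_request(59.91, 10.75)  # Oslo
print(req.get_header("User-agent"))  # weather-forecast-app example.com
```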

&lt;p&gt;Now the design looks good to me, so let’s move on to generate the implementation plan.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b3sl9qma2c7ddnpnr4x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b3sl9qma2c7ddnpnr4x.png" width="800" height="460"&gt;&lt;/a&gt;&lt;/p&gt;








&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Weather Forecast App Design Document

## Overview

The weather forecast application is a serverless web application that displays tomorrow's weather forecast for four European cities: Oslo (Norway), Paris (France), London (United Kingdom), and Barcelona (Spain). The application will be deployed on AWS using Terraform infrastructure-as-code and will integrate with the Norwegian Meteorological Institute's weather API.

### Key Design Principles
- **Serverless-first architecture** for minimal operational overhead
- **Mobile-responsive design** for optimal user experience across devices
- **Fast response times** through efficient caching and CDN distribution
- **Well-architected AWS infrastructure** following security and reliability best practices

## Architecture

### High-Level Architecture
The application follows a serverless architecture pattern with the following components:

1. **Frontend** : Static web application hosted on S3 with CloudFront distribution
2. **Backend API** : AWS Lambda functions for weather data processing
3. **Data Layer** : DynamoDB for caching weather data and API rate limiting
4. **External Integration** : Norwegian Meteorological Institute API (api.met.no)

### Architecture Rationale
- **Static hosting with S3/CloudFront** : Provides fast global content delivery and handles traffic spikes efficiently
- **Lambda functions** : Serverless compute eliminates server management and scales automatically
- **DynamoDB** : NoSQL database perfect for caching weather data with TTL capabilities
- **API Gateway** : Provides managed API endpoints with built-in throttling and monitoring

## SNIP END
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The design grew to more than 300 lines, so I’m only including a snippet here. You can review the complete file here: &lt;a href="https://github.com/haakond/terraform-aws-weather-forecast/blob/main/.kiro/specs/weather-forecast-app/design.md" rel="noopener noreferrer"&gt;.kiro/specs/weather-forecast-app/design.md&lt;/a&gt;&lt;/p&gt;
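&lt;p&gt;One detail worth noting in the design is the DynamoDB cache with TTL capabilities. A minimal, hypothetical sketch of what such a cache item could look like (the attribute names are my assumptions, not the article’s code):&lt;/p&gt;

```python
import time

# Hypothetical sketch of the design's 1-hour DynamoDB cache: each item
# carries a "ttl" attribute (epoch seconds) that DynamoDB's TTL feature
# uses to expire stale forecasts in the background.
CACHE_TTL_SECONDS = 3600

def make_cache_item(city_id, forecast, now=None):
    now = int(time.time() if now is None else now)
    return {
        "city_id": city_id,              # partition key (assumed name)
        "forecast": forecast,
        "cached_at": now,                # fallback for the lastUpdated field
        "ttl": now + CACHE_TTL_SECONDS,  # DynamoDB TTL attribute
    }

item = make_cache_item("oslo", {"temp_c": 4}, now=1700000000)
print(item["ttl"] - item["cached_at"])  # 3600
```

&lt;p&gt;With TTL enabled on the ttl attribute, DynamoDB removes expired items automatically, so the Lambda handler only has to decide between a cache hit and a fresh API call.&lt;/p&gt;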

&lt;p&gt;This looks good to me, so let’s proceed to generate the implementation plan.&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 3: Implement
&lt;/h4&gt;

&lt;p&gt;Kiro has now generated a task list for implementation based on the previous steps.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7cgjflshii708d1rq3a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7cgjflshii708d1rq3a.png" width="800" height="491"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/haakond/terraform-aws-weather-forecast/blob/main/.kiro/specs/weather-forecast-app/tasks.md" rel="noopener noreferrer"&gt;.kiro/specs/weather-forecast-app/tasks.md&lt;/a&gt; now looks like this:&lt;/p&gt;








&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Implementation Plan

- [] 1. Set up project structure and configuration
  - Create Terraform module directory structure following best practices
  - Set up Python virtual environment with pyenv for application development
  - Configure pre-commit hooks for Terraform validation and formatting
  - Create basic project documentation structure with docs/ directory
  - _Requirements: 3.1, 3.2_

- [] 2. Implement simplified Lambda weather service
  - [] 2.1 Create embedded weather service in Lambda handler
    - Implement weather data fetching directly in lambda_handler.py using urllib
    - Define city configuration with coordinates for Oslo, Paris, London, Barcelona
    - Create weather data processing and transformation logic embedded in handler
    - Implement proper User-Agent header with configurable company website
    - Add rate limiting with simple delays between API calls
    - _Requirements: 2.2, 2.3, 1.1_

  - [] 2.2 Implement weather API integration and processing
    - Create fetch_weather_data function for met.no API calls
    - Implement extract_tomorrow_forecast for parsing API responses
    - Add weather condition mapping and error handling
    - Create process_city_weather for individual city processing
    - Implement get_weather_summary for all cities with delay between calls
    - _Requirements: 2.2, 1.2_

- [] 3. Build Lambda function infrastructure
  - [] 3.1 Create simplified Lambda function handler
    - Implement main Lambda handler with embedded weather service
    - Add environment variable configuration for company website
    - Implement proper error handling and logging with standardized responses
    - Create /health endpoint for monitoring with environment information
    - Add CORS support and OPTIONS request handling
    - _Requirements: 2.1, 2.4, 3.6_

  - [] 3.2 Add DynamoDB caching to simplified Lambda handler
    - Implement DynamoDB caching directly in the lambda_handler.py file
    - Add cache check before API calls and cache storage after successful API responses
    - Implement 1-hour TTL (3600 seconds) for cached weather data
    - Add error handling for DynamoDB operations with fallback to API calls
    - Use boto3 client for DynamoDB operations embedded in the handler
    - _Requirements: 2.2, 1.2, 3.6_

- [] 4. Update Terraform infrastructure for simplified approach
  - [] 4.1 Maintain DynamoDB table configuration for caching
    - Keep existing DynamoDB table from Terraform backend module
    - Maintain DynamoDB-related IAM permissions for Lambda role
    - Ensure DynamoDB table name is passed to Lambda via environment variable
    - Keep TTL configuration for 1-hour cache expiration
    - Maintain existing tests for DynamoDB validation
    - _Requirements: 3.1, 3.6, 3.8_

  - [] 4.2 Update Lambda function Terraform module for simplified deployment
    - Update Terraform configuration for simplified Lambda function
    - Maintain DynamoDB environment variables (table name) and permissions
    - Keep COMPANY_WEBSITE environment variable configuration
    - Maintain X-Ray tracing and CloudWatch logging
    - Keep IAM role with DynamoDB permissions for caching
    - _Requirements: 3.1, 3.4, 3.6, 3.8_

  - [] 4.3 Maintain API Gateway configuration
    - Keep existing API Gateway REST API configuration
    - Maintain CORS settings and rate limiting
    - Keep Lambda integration with proper error handling
    - Maintain CloudWatch logging configuration
    - Keep existing tests for the API Gateway setup
    - _Requirements: 3.1, 3.5, 3.6_

- [] 5. Create frontend application
  - [] 5.1 Build responsive weather display components
    - Create React components for weather card display
    - Implement responsive grid layout for four cities
    - Add loading states and error handling UI
    - Ensure mobile-optimized design with proper breakpoints
    - _Requirements: 1.1, 1.2, 1.3, 2.1_

  - [] 5.2 Implement API integration and state management
    - Create API client for backend weather service
    - Implement data fetching with error handling and retries
    - Add browser-side caching strategy respecting 1-hour backend cache
    - Create loading and error state management
    - _Requirements: 1.2, 2.1, 2.2_

  - [] 5.3 Add weather icons and styling
    - Implement weather condition icon mapping
    - Create CSS styling for responsive design
    - Add animations and transitions for better UX
    - Ensure accessibility compliance (WCAG)
    - _Requirements: 1.1, 1.3, 2.1_

  - [] 5.4 Optimize frontend build for caching
    - Configure build process to generate static assets optimized for 15-minute caching
    - Ensure proper file naming and versioning for cache busting when needed
    - Validate that all static assets (HTML, CSS, JS, images) are properly configured
    - _Requirements: 1.2, 1.4_

- [] 6. Configure static hosting infrastructure
  - [] 6.1 Create S3 bucket for static hosting
    - Implement Terraform module for S3 bucket configuration
    - Configure bucket policies for static website hosting
    - Set up versioning and lifecycle policies
    - Add proper IAM permissions for deployment
    - Write basic tests for the S3 configuration
    - _Requirements: 3.1, 3.4, 3.6_

  - [] 6.2 Set up CloudFront distribution
    - Create Terraform module for CloudFront CDN
    - Configure cache behaviors and TTL settings
    - Set up origin failover for high availability
    - Add security headers and HTTPS redirection
    - Write basic tests for the CloudFront configuration
    - _Requirements: 1.2, 3.1, 3.8_

  - [] 6.3 Configure Cache-Control headers for static content
    - Configure S3 bucket metadata to set Cache-Control: max-age=900 for all static assets
    - Update CloudFront cache behaviors to respect and forward Cache-Control headers
    - Ensure consistent 15-minute caching for HTML, CSS, JavaScript, and image files
    - _Requirements: 1.2, 1.4_

  - [] 6.4 Configure CloudFront price class and optimization settings
    - Update CloudFront distribution to use price class 100 (PriceClass_100)
    - Configure allowed HTTP methods to GET, HEAD, and OPTIONS only
    - Set up caching policy configuration based on query parameters
    - Configure default TTL to 900 seconds (15 minutes)
    - Ensure coverage includes Europe and United States edge locations
    - Validate cost optimization while maintaining performance for target regions
    - Update Terraform configuration with appropriate price_class, allowed_methods, and caching parameters
    - Test CloudFront distribution functionality with new configuration
    - _Requirements: 3.9, 3.10, 3.11, 3.12_

- [] 7. Implement monitoring and observability
  - [] 7.1 Create simple and intuitive CloudWatch dashboard and alarms
    - Implement Terraform module for CloudWatch dashboard
    - Configure the most important alarms for Lambda errors, API Gateway 5xx, and DynamoDB throttling
    - Set up custom metrics for weather API success rates
    - Add log retention policies (180 days)
    - _Requirements: 3.6, 3.7_

  - [] 7.2 Set up AWS Budget and cost monitoring
    - Create Terraform module for AWS Budget with Service tag filter
    - Configure budget alerts for cost thresholds
    - Implement simple and intuitive cost monitoring CloudWatch dashboard
    - _Requirements: 3.3, 3.7_

- [] 8. Create deployment and testing automation
  - [] 8.1 Implement Terraform module packaging
    - Create main Terraform module with all sub-modules
    - Configure variable definitions and outputs
    - Add module documentation with terraform-docs
    - Create examples/ directory with usage examples
    - _Requirements: 3.1, 3.2, 3.7_

  - [] 8.2 Add basic integration and end-to-end tests
    - Create integration tests for complete weather data flow
    - Implement end-to-end tests for user journey with CloudWatch synthetics
    - Create basic infrastructure deployment tests
    - Write basic test automation scripts with cleanup
    - _Requirements: 2.4, 3.2_

  - [] 8.3 Add cache header validation tests
    - Create automated tests to verify Cache-Control headers are properly set
    - Test that static assets return max-age=900 in response headers
    - Validate cache behavior across different asset types (HTML, CSS, JS, images)
    - _Requirements: 1.4, 2.4_

  - [] 8.4 Fix CI/CD deployment path issues
    - Resolve frontend build path problems in CI/CD environments where working directory structure differs
    - Update Terraform frontend module to handle different working directory structures and missing directories
    - Add proper error handling and path validation for frontend build process
    - Ensure frontend directory and package.json are found correctly in CI/CD pipelines
    - Test build process works in both local development and CI/CD environments
    - _Requirements: 3.1, 3.4_

  - [] 8.5 Update unit tests for simplified Lambda implementation
    - Update existing unit tests to work with the simplified embedded Lambda handler
    - Remove tests for separate weather service modules (api_client, cache, processor, etc.)
    - Create focused tests for the main Lambda handler functions
    - Test weather data fetching, processing, response formatting, and DynamoDB caching
    - Ensure tests cover error handling, cache hits/misses, and edge cases
    - _Requirements: 2.4, 3.2_

  - [] 8.6 Add frontend error loop prevention safeguards
    - Implement circuit breaker pattern in useWeatherData hook to prevent infinite retry loops
    - Add exponential backoff with maximum delay caps for failed requests
    - Implement request rate limiting to prevent rapid successive API calls on errors
    - Add error threshold detection to disable auto-retry after consecutive failures
    - Create user-friendly error states that prevent automatic retry loops
    - _Requirements: 1.2, 2.1_

  - [] 8.7 Configure reasonable Lambda concurrency limits
    - Set Lambda reserved concurrency to 5 concurrent executions (reasonable for weather API)
    - Update backend module variables to reflect appropriate concurrency limits
    - Add documentation explaining concurrency limits and cost implications
    - Ensure concurrency limits prevent runaway costs while maintaining service availability
    - Test concurrency limits under load to ensure proper throttling behavior
    - _Requirements: 3.6, 3.8_

  - [] 8.8 Implement dynamic cache-control headers in Lambda function
    - Update Lambda handler to set cache-control: max-age=60 for successful weather API responses
    - Set cache-control: max-age=0 for failed weather API responses or error conditions
    - Ensure cache-control headers are properly included in HTTP response headers
    - Test cache-control behavior for both success and failure scenarios
    - _Requirements: 2.5, 2.6_

  - [] 8.9 Implement lastUpdated timestamp handling in Lambda function
    - Update Lambda handler to include lastUpdated timestamp in all API responses
    - Use weather API timestamp when available in the met.no API response
    - Fall back to DynamoDB cache timestamp when weather API timestamp is not provided
    - Ensure timestamp is in ISO 8601 format for consistent frontend display
    - Test timestamp handling for both fresh API calls and cached responses
    - _Requirements: 2.7, 2.8_

  - [] 8.10 Update frontend to display lastUpdated timestamp
    - Modify weather display components to show the lastUpdated timestamp from API responses
    - Format timestamp for user-friendly display (e.g., "Last updated: 2 minutes ago")
    - Handle cases where lastUpdated is null or missing
    - Ensure timestamp display is responsive and accessible
    - _Requirements: 2.7_

- [] 9. Generate documentation and cost analysis
  - [] 9.1 Create architecture diagrams
    - Generate AWS architecture diagram using MCP diagram server
    - Create sequence diagrams for weather data flow
    - Add deployment flow diagrams
    - Include diagrams in main README.md
    - _Requirements: 3.7_

  - [] 9.2 Perform cost analysis and optimization
    - Use AWS Labs Pricing MCP server for cost calculations
    - Compare costs across eu-west-1, eu-central-1, eu-north-1 regions
    - Create cost projections for staging and production environments
    - Document top three cost optimization opportunities
    - Include cost analysis in main README.md
    - _Requirements: 3.7_

- [] 10. Finalize project documentation
  - Create crisp and clear README.md with TL;DR section
  - Add executive summary for project stakeholders
  - Create basic deployment guide and troubleshooting documentation
  - Write operational runbooks for maintenance
  - Add basic examples for CI/CD integration and how to configure relevant variables
  - _Requirem
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
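&lt;p&gt;As a rough illustration of what task 2.2 above asks for, a forecast-extraction helper could look like this. This is a hypothetical sketch, not the generated code: the function name and the response field names are assumptions based on met.no’s compact locationforecast format, so verify them against the actual API response.&lt;/p&gt;

```python
from datetime import date, timedelta

def extract_tomorrow_forecast(payload, tomorrow=None):
    """Pick the first forecast entry for tomorrow from a met.no-style
    compact payload (field names are assumptions; verify against the API)."""
    if tomorrow is None:
        tomorrow = (date.today() + timedelta(days=1)).isoformat()
    for entry in payload.get("properties", {}).get("timeseries", []):
        # Timestamps are ISO 8601, so a date prefix match finds the day.
        if entry.get("time", "").startswith(tomorrow):
            details = entry["data"]["instant"]["details"]
            summary = entry["data"].get("next_6_hours", {}).get("summary", {})
            return {
                "temperature": details.get("air_temperature"),
                "condition": summary.get("symbol_code", "unknown"),
            }
    return None  # no entries for tomorrow in the payload

# Minimal synthetic payload for demonstration:
sample = {"properties": {"timeseries": [
    {"time": "2025-08-12T12:00:00Z",
     "data": {"instant": {"details": {"air_temperature": 18.3}},
              "next_6_hours": {"summary": {"symbol_code": "cloudy"}}}},
]}}
forecast = extract_tomorrow_forecast(sample, tomorrow="2025-08-12")
```

&lt;p&gt;The real handler would additionally fetch the payload with urllib, send the configurable User-Agent header, and sleep briefly between cities for rate limiting, per tasks 2.1 and 2.2.&lt;/p&gt;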



&lt;p&gt;In the IDE you will see an option to start the tasks:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkd3y2917fuu9v12sxtu5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkd3y2917fuu9v12sxtu5.png" width="800" height="387"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can either trigger them one by one or ask Kiro in the chat to get started.&lt;/p&gt;

&lt;p&gt;Let’s start the first task.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwyj4qwp5srqb1qce6v1j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwyj4qwp5srqb1qce6v1j.png" width="596" height="157"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9vw5vf7jx51plvdthapr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9vw5vf7jx51plvdthapr.png" width="800" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can see that Kiro is setting up the structure according to what’s stated on the external website &lt;a href="http://terraform-best-practices.com" rel="noopener noreferrer"&gt;terraform-best-practices.com&lt;/a&gt;, as requested. Nice.&lt;/p&gt;

&lt;p&gt;Like with Amazon Q Developer, you can approve or let Kiro trust specific tools and commands:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffx69mhg425ajrf0q0acy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffx69mhg425ajrf0q0acy.png" width="800" height="318"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After a short while the first task is complete!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjm9crsv1mftobi5fqe33.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjm9crsv1mftobi5fqe33.png" width="800" height="602"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I did note that Terraform AWS provider major version 5 was installed, while the latest release is 6.7.0, so I updated the Tech Steering specification and asked Kiro to refresh.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1866g5zkgurt0r3vsg9a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1866g5zkgurt0r3vsg9a.png" width="800" height="259"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Kiro refreshed the context guidelines, found what to update, performed the changes, and ran checks and pre-commit to verify everything worked as expected.&lt;/p&gt;

&lt;p&gt;Then we move on to Task 2: Implement core Python service, and so on.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgz05e3eddyravi0lawny.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgz05e3eddyravi0lawny.png" width="800" height="321"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is what the implementation summary looks like for task 7.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Favbgn0jxpx7f2mtx4jho.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Favbgn0jxpx7f2mtx4jho.png" width="800" height="642"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see, the main difference from the traditional vibe/CLI approach is that the spec-driven workflow keeps the steering and requirements up to date, persisting the context. This makes the process a lot more predictable and possible to collaborate on in a team.&lt;/p&gt;

&lt;p&gt;Keeping the specifications version controlled along with the application codebase makes it easy to track changes as they are committed.&lt;/p&gt;

&lt;p&gt;Depending on your team development workflow, each feature could be organized as a Spec, containing relevant user stories.&lt;/p&gt;

&lt;p&gt;Kiro generated a comprehensive local testing suite. It turned out to be more complex than I think is necessary, with some tests being flaky. I asked Kiro to focus on testing the core functionality and remove brittle and complex tests. A reflection here is that I did not specify the desired test approach in detail in my steering context.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9svkb8prnonms4tmctgd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9svkb8prnonms4tmctgd.png" width="800" height="532"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here are the diagrams the AWS Diagrams MCP server helped create:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frnpqla0bzjo6soyyxrdz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frnpqla0bzjo6soyyxrdz.png" width="800" height="731"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9xh1l36p8olch65nbui0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9xh1l36p8olch65nbui0.png" width="290" height="1024"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploying the final solution
&lt;/h2&gt;

&lt;p&gt;I included the module definition in my existing Github Actions CI/CD Terraform codebase:&lt;/p&gt;








&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;module "weather_forecast_app" {
  source = "git::https://github.com/haakond/terraform-aws-weather-forecast.git?ref=COMMIT-SHA"
  project_name = "weather-forecast-app"
  environment = "prod"
  aws_region = "eu-west-1"
  weather_service_identification_domain = "youramazingwebsite.com"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I experienced a few Terraform errors that Kiro wasn’t able to catch before terraform plan and apply:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CloudWatch Logs groups defined with the same name in two sub-modules&lt;/li&gt;
&lt;li&gt;Missing API Gateway Account configuration for CloudWatch Logs&lt;/li&gt;
&lt;li&gt;Missing deploy process for the React frontend to Amazon S3

&lt;ul&gt;
&lt;li&gt;Since this is a fairly static app, I decided to keep it with the infrastructure code for simplicity.&lt;/li&gt;
&lt;li&gt;For production frontend applications I would set up a dedicated CI/CD pipeline to deploy only the frontend codebase. &lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;An overly complex React frontend application, which I asked Kiro to simplify.&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;With some additional prompting assistance from Kiro I was able to resolve the issues and end up with a fully working deployment!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fawkmuazmrkrljtb9s7ch.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fawkmuazmrkrljtb9s7ch.png" width="800" height="700"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fncxh8z9zi5sqyl44q21p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fncxh8z9zi5sqyl44q21p.png" width="800" height="580"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;End result:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2q2ykfggwix67y8t6ipt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2q2ykfggwix67y8t6ipt.png" width="800" height="442"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Workflow for adding a new feature
&lt;/h3&gt;

&lt;p&gt;Here’s one suggested approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;git clone into a new feature branch&lt;/li&gt;
&lt;li&gt;Specify

&lt;ul&gt;
&lt;li&gt;For a major new feature: create a new Specification (requirements, design, tasks)&lt;/li&gt;
&lt;li&gt;For a minor improvement: Incorporate into an existing Specification&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Implement requirements&lt;/li&gt;

&lt;li&gt;Create pull request&lt;/li&gt;

&lt;li&gt;Review and merge to main&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Learnings and key takeaways
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvtqlaxy492og335cwwje.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvtqlaxy492og335cwwje.png" width="538" height="34"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;If you make manual changes outside of Kiro, the specs (Requirements, Design, Tasks) will deviate and confuse Kiro. Stick to the Kiro workflow and ask Kiro to refresh what you did; Kiro will backport the changes into the specs.

&lt;ul&gt;
&lt;li&gt;If during the preview period you keep hitting Kiro’s usage limit, a workaround can be to have Amazon Q Developer help you when troubleshooting, to save interaction tokens. Just make sure to tell Kiro which areas have changed so the specs stay up to date.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Separate product feature specifications from common steering context.&lt;/li&gt;
&lt;li&gt;Kiro has a tendency to add more comprehensive and complex testing procedures and documentation than a human would appreciate. Be explicit about keeping things simple and focus on the core functionality.&lt;/li&gt;
&lt;li&gt;To ensure the suggested tests are relevant and follow your company practices, add a detailed Agent Steering document for testing.&lt;/li&gt;
&lt;li&gt;Organize specs by feature, to be able to work independently without conflicts or affecting other areas. This can also reduce the blast radius in case of unexpected changes, plus it reduces the context size for the agent. &lt;/li&gt;
&lt;li&gt;Keep specs and user stories in version control along with your application. If you perform traditional manual changes, give Kiro a hint to have the design and requirements updated accordingly.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Reflections on context
&lt;/h3&gt;

&lt;p&gt;Traditionally, the environment-specific context is a company’s tech stack, policies, and guidelines, combined with the knowledge of experienced software engineers. This information is known and acquired in your setting and is normally not included in (Jira) user stories. However, coding agents by default do not possess this context. The closest thing may be the ability to configure Amazon Q Developer with custom repositories, so that it can learn about company-specific coding standards, libraries, and so on.&lt;/p&gt;

&lt;p&gt;Spec-driven development with Kiro now forces this context to be defined as Agent Steering resources. Teams can organize workshops to document their guiding tenets and principles, organized in a Git repo and iterated on as the information evolves. Perhaps your team already has something similar documented in a company wiki?&lt;/p&gt;

&lt;p&gt;I think we need to help AI coding companions build the same mental framework we’d give to a human colleague during onboarding and code review. Kiro solves this by checking in your specifications to Git. Consider creating a Kiro app bootstrap repository or include common steering as a Git submodule, package reference or similar.&lt;/p&gt;

&lt;h2&gt;
  
  
  Kiro pricing
&lt;/h2&gt;

&lt;p&gt;When Kiro becomes Generally Available, there will be different tiers available to match your level of usage.&lt;/p&gt;

&lt;p&gt;Vibe Requests cover any agentic operation in Kiro that does not involve execution of a Spec Task.&lt;/p&gt;

&lt;p&gt;You start with Vibe Requests to create requirements, design documents, and tasks.&lt;/p&gt;

&lt;p&gt;One Vibe Request typically equals one message or prompt, while one Spec Request equals executing a single Spec Task.&lt;/p&gt;

&lt;p&gt;For more information see &lt;a href="https://kiro.dev/blog/understanding-kiro-pricing-specs-vibes-usage-tracking/" rel="noopener noreferrer"&gt;https://kiro.dev/blog/understanding-kiro-pricing-specs-vibes-usage-tracking/&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;My personal experience is that I really appreciate the spec-driven development process. Kiro is a game-changer that can significantly boost what builders are able to produce. There are no more valid excuses for skipping sufficient test coverage, or for struggling to build even a nice-looking frontend app!&lt;/p&gt;

&lt;p&gt;Kiro is not just a new tool, it is a new workflow: AI-assisted, spec-driven development that can incorporate mature engineering practices. My initial evaluation leaves me thinking that AI is starting to grow up and become more professional. As a bonus, Kiro forces you to write better specifications, which can make it easier to establish common alignment in a team and while onboarding new team members.&lt;/p&gt;

&lt;p&gt;This approach is a giant leap forward compared to coding assistants pre H2 2025. &lt;a href="https://docs.aws.amazon.com/amazonq/latest/qdeveloper-ug/context-project-rules.html" rel="noopener noreferrer"&gt;Project Rules&lt;/a&gt; and &lt;a href="https://docs.aws.amazon.com/amazonq/latest/qdeveloper-ug/customizations.html" rel="noopener noreferrer"&gt;Customizations&lt;/a&gt; in &lt;a href="https://docs.aws.amazon.com/amazonq/latest/qdeveloper-ug/what-is.html" rel="noopener noreferrer"&gt;Amazon Q Developer&lt;/a&gt; were a step in the right direction, but Kiro brings a sought-after structure and consistency that, in my opinion, is not only nice but necessary in a professional context. Yes, there are still some bugs and quirks (at the time of writing, Kiro is still in limited preview), but I am optimistic that this technology and the underlying Large Language Models will mature and produce results of increasingly higher quality and predictability over the coming months. Earlier, my experience was that code assistant suggestions could get me around ~70% of the way, with the remaining ~30% requiring traditional authoring for precision. Kiro boosts this to maybe ~85%+. I would claim that return on investment on a Kiro license can be achieved pretty fast.&lt;/p&gt;

&lt;p&gt;I would still prefer a knowledgeable human in the loop, reviewing pull requests and making the final decision on changes, at least until agentic AI has matured a bit more. It will be exciting to see how this field evolves over the next couple of years.&lt;/p&gt;

&lt;p&gt;I encourage you to try out spec-driven development with your colleagues and warmly welcome Kiro as your new team member.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://kiro.dev/" rel="noopener noreferrer"&gt;https://kiro.dev/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://kiro.dev/blog/introducing-kiro/" rel="noopener noreferrer"&gt;https://kiro.dev/blog/introducing-kiro/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://kiro.dev/blog/kiro-and-the-future-of-software-development/" rel="noopener noreferrer"&gt;https://kiro.dev/blog/kiro-and-the-future-of-software-development/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.anthropic.com/claude/sonnet" rel="noopener noreferrer"&gt;Anthropic Claude Sonnet&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;&lt;a href="https://builder.aws.com/content/30HonEOE2KzaYCYlyDRdbE25yxz/ai-driven-development-life-cycle-reimagining-software-engineering" rel="noopener noreferrer"&gt;AWS Builder Center blog: AI-Driven Development Life Cycle: Reimagining Software Engineering&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Header image courtesy of &lt;a href="https://aws.amazon.com/ai/generative-ai/nova/" rel="noopener noreferrer"&gt;Amazon Nova Canvas 1.0&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The post &lt;a href="https://hedrange.com/2025/08/11/how-to-use-kiro-for-ai-assisted-spec-driven-development/" rel="noopener noreferrer"&gt;How to use Kiro for AI assisted spec-driven development&lt;/a&gt; first appeared on &lt;a href="https://hedrange.com/" rel="noopener noreferrer"&gt;Håkon Eriksen Drange&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>articles</category>
    </item>
    <item>
      <title>Extensive reporting of Well-Architected Maturity</title>
      <dc:creator>Håkon Eriksen Drange</dc:creator>
      <pubDate>Sun, 20 Jul 2025 07:59:45 +0000</pubDate>
      <link>https://forem.com/haakoned/extensive-reporting-of-well-architected-maturity-of4</link>
      <guid>https://forem.com/haakoned/extensive-reporting-of-well-architected-maturity-of4</guid>
      <description>&lt;h5&gt;
  
  
  Table of contents
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;How to generate a Well-Architected Compliance report&lt;/li&gt;
&lt;li&gt;What a Well-Architected Framework Compliance report looks like&lt;/li&gt;
&lt;li&gt;How the reporting feature works&lt;/li&gt;
&lt;li&gt;Conclusion&lt;/li&gt;
&lt;li&gt;Feedback and contributions&lt;/li&gt;
&lt;li&gt;Resources&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In the post &lt;a href="https://hedrange.com/2025/04/15/how-to-measure-well-architected-maturity/" rel="noopener noreferrer"&gt;How to measure Well-Architected maturity&lt;/a&gt; we explored how extensive insight into cloud infrastructure posture could help accelerate the &lt;em&gt;Measure&lt;/em&gt; phase, leaving more time to &lt;em&gt;Learn&lt;/em&gt; and discuss opportunities for &lt;em&gt;Improvement&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ydvlnkdryn7v3vbu962.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ydvlnkdryn7v3vbu962.png" alt="AWS WAFR process" width="300" height="292"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Figure – Well-Architected Framework review cycle courtesy of AWS&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A key factor was to make the data available while performing a review in the Well-Architected Tool. After discussing the solution with colleagues and customers, I realized that the valuable data points weren’t being used to their full potential when exposed only through the Notes field in the Well-Architected Tool. The key constraints are that the Notes field is limited to plain text and a maximum of 2000 characters. Valuable resource identifiers and tags had to be truncated, which made it harder to identify the applicable resources. Could there be a better way?&lt;/p&gt;
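To illustrate the constraint, here is a minimal sketch of trimming a compliance summary so it fits within the 2000-character Notes limit before it is written back via boto3 (the truncation marker is my own convention, not part of any AWS API):

```python
NOTES_LIMIT = 2000  # the Well-Architected Tool Notes field caps at 2000 characters


def truncate_notes(text: str, limit: int = NOTES_LIMIT) -> str:
    """Trim a compliance summary so it fits within the Notes field limit."""
    if len(text) > limit:
        marker = " [truncated]"  # illustrative marker, not an API convention
        return text[: limit - len(marker)] + marker
    return text
```

The result could then be passed as the `Notes` parameter of the Well-Architected Tool's `update_answer` API through boto3.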

&lt;p&gt;I decided to develop an additional AWS Lambda function called &lt;code&gt;well_architected_report_generator&lt;/code&gt;. The main purpose of this Lambda function is to collect all the available data points from various sources and generate a report in HTML format stored in Amazon Simple Storage Service (S3). When performing Well-Architected Framework Reviews, you are most likely already logged in to the AWS Console to access the Well-Architected Tool, so opening an additional tab with S3 could be useful. Thanks to Amazon Q Developer the HTML and CSS came out pretty good, too!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Current data sources:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS Config Conformance Packs&lt;/li&gt;
&lt;li&gt;AWS Trusted Advisor checks (available checks depend on the active AWS Support Plan)&lt;/li&gt;
&lt;li&gt;Resource Tag: Name&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  How to generate a Well-Architected Compliance report
&lt;/h2&gt;

&lt;p&gt;Deploy the Terraform module as described in &lt;a href="https://hedrange.com/2025/04/15/how-to-measure-well-architected-maturity/" rel="noopener noreferrer"&gt;How to measure Well-Architected Maturity&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Please note that three new (optional) Terraform module variables have been introduced:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;textarea tabindex="-1" aria-hidden="true" readonly&amp;gt;variable "deploy_aws_config_recorder" {
  description = "Set to true to deploy an AWS Config Recorder. If you already have a customer managed AWS Config recorder in the desired region, set to false. AWS supports only one customer managed configuration recorder for each account for each AWS Region."
  type = bool
  default = true
}

variable "reports_bucket_name_prefix" {
  description = "Prefix for the S3 bucket name that stores Well-Architected compliance reports"
  type = string
  default = "well-architected-compliance-reports"
}

variable "reports_retention_days" {
  description = "Number of days to retain non-current versions of reports in the S3 bucket"
  type = number
  default = 90
}&amp;lt;/textarea&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;variable "deploy_aws_config_recorder" {
  description = "Set to true to deploy an AWS Config Recorder. If you already have a customer managed AWS Config recorder in the desired region, set to false. AWS supports only one customer managed configuration recorder for each account for each AWS Region."
  type = bool
  default = true
}

variable "reports_bucket_name_prefix" {
  description = "Prefix for the S3 bucket name that stores Well-Architected compliance reports"
  type = string
  default = "well-architected-compliance-reports"
}

variable "reports_retention_days" {
  description = "Number of days to retain non-current versions of reports in the S3 bucket"
  type = number
  default = 90
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At the time of writing, AWS supports only one customer managed AWS Config recorder per account per region. If you already have one, set &lt;code&gt;deploy_aws_config_recorder&lt;/code&gt; to &lt;code&gt;false&lt;/code&gt; to re-use it.&lt;/p&gt;

&lt;p&gt;If you would like to retain the reports for longer than the default of 90 days, adjust &lt;code&gt;reports_retention_days&lt;/code&gt; accordingly.&lt;/p&gt;

&lt;p&gt;A minimal example module call if you already have a customer managed AWS Config recorder in your region and you prefer to retain the report files for 400 days may look like this:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;textarea tabindex="-1" aria-hidden="true" readonly&amp;gt;module "well_architected_config_conformance_pack" {
  source = "git::https://github.com/soprasteria/terraform-aws-wellarchitected-conformance.git?ref=&amp;amp;lt;DESIRED-COMMIT-SHA&amp;amp;gt;"
  deploy_aws_config_recorder = false
  reports_retention_days = 400
}&amp;lt;/textarea&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;module "well_architected_config_conformance_pack" {
  source = "git::https://github.com/soprasteria/terraform-aws-wellarchitected-conformance.git?ref=&amp;lt;DESIRED-COMMIT-SHA&amp;gt;"
  deploy_aws_config_recorder = false
  reports_retention_days = 400
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait 24 hours for data to be collected and aggregated.&lt;/p&gt;

&lt;p&gt;Then, in the AWS Console, go to AWS Lambda and locate the function &lt;code&gt;well_architected_report_generator&lt;/code&gt;.&lt;br&gt;&lt;br&gt;
 &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff18rgg3ri86mfvm8kmkb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff18rgg3ri86mfvm8kmkb.png" width="800" height="111"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Create a new Test event with a JSON payload containing &lt;code&gt;workload_id&lt;/code&gt; for the Well-Architected Tool workload (not the complete ARN, just the last part) and &lt;code&gt;dry_run&lt;/code&gt;. Example payload:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;textarea tabindex="-1" aria-hidden="true" readonly&amp;gt;{
  "workload_id": "141970ea95fd5b4329cyh05502659f39",
  "dry_run": 0
}&amp;lt;/textarea&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "workload_id": "141970ea95fd5b4329cyh05502659f39",
  "dry_run": 0
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hit Test. Execution may take a minute or two, depending on the number of AWS resources deployed in the AWS account.&lt;/p&gt;
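If you prefer to trigger the function programmatically instead of through the console Test button, a minimal boto3 sketch could look like this (function name and payload keys are taken from the example above; assumes credentials for the target account and region are configured):

```python
import json


def build_payload(workload_id: str, dry_run: int = 0) -> str:
    """Build the JSON event payload expected by the report generator."""
    return json.dumps({"workload_id": workload_id, "dry_run": dry_run})


def invoke_report_generator(workload_id: str, dry_run: int = 0) -> dict:
    """Invoke the Lambda function synchronously and decode its response."""
    import boto3  # imported lazily so build_payload stays usable offline

    client = boto3.client("lambda")
    response = client.invoke(
        FunctionName="well_architected_report_generator",
        InvocationType="RequestResponse",
        Payload=build_payload(workload_id, dry_run),
    )
    return json.loads(response["Payload"].read())
```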

&lt;p&gt;The CloudWatch Logs output will let you know which AWS Support plan is detected and where you can find the produced report.&lt;br&gt;&lt;br&gt;
&lt;em&gt;Starting Well-Architected Report Generator&lt;br&gt;Collecting compliance data from AWS Config&lt;br&gt;Trusted Advisor compliance status mapping&lt;br&gt;AWS Business or Enterprise Support is/is not enabled&lt;br&gt;Successfully uploaded report to s3://well-architected-compliance-reports-123456789012/Reports/well_architected_compliance_report_%timestamp%.html&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Navigate to Amazon S3, find the most recent report by sorting on the Last modified column, check the box to the left of Name and click Open to access the report in a new browser tab.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbbpa7awwl0iff5v7xu7s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbbpa7awwl0iff5v7xu7s.png" width="800" height="192"&gt;&lt;/a&gt;&lt;/p&gt;
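The most recent report can also be located programmatically instead of sorting in the console. A sketch, assuming the default bucket naming from the module (prefix plus account ID, as seen in the log excerpt above):

```python
def latest_report(objects: list) -> dict:
    """Pick the most recently modified object from an S3 listing."""
    return max(objects, key=lambda o: o["LastModified"])


def latest_report_key(bucket: str, prefix: str = "Reports/") -> str:
    """Return the key of the newest report in the given bucket."""
    import boto3  # imported lazily so latest_report stays usable offline

    s3 = boto3.client("s3")
    listing = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return latest_report(listing.get("Contents", []))["Key"]
```

For example, `latest_report_key("well-architected-compliance-reports-123456789012")` would return the key of the newest report, which can then be opened or downloaded.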

&lt;h2&gt;
  
  
  What a Well-Architected Framework Compliance report looks like
&lt;/h2&gt;

&lt;p&gt;The produced report contains information about the time of generation, the AWS Account ID, region and detected AWS Support plan.&lt;/p&gt;

&lt;p&gt;Example report 1, an AWS account with Basic Support:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzk95j5y2nr8aeivul8u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzk95j5y2nr8aeivul8u.png" width="800" height="203"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Example report 2, an AWS account with Enterprise Support:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feo1h1zble6joaxy9hjos.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feo1h1zble6joaxy9hjos.png" width="800" height="214"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The report may contain a lot of detailed information, so an Executive Summary is provided as a high-level overview.&lt;/p&gt;

&lt;p&gt;Example report 1:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9w7dbn7o6mge2sx6mqaq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9w7dbn7o6mge2sx6mqaq.png" width="800" height="471"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Example report 2:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7996xelhntozr5ty7zjy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7996xelhntozr5ty7zjy.png" width="800" height="486"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, a Table of Contents is provided, with an overview of the available Well-Architected Framework pillars and questions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpwio2wz1pwio2j5de1ml.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpwio2wz1pwio2j5de1ml.png" width="800" height="407"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each question includes relevant context information from the Well-Architected Framework and a link to further guidance.&lt;/p&gt;

&lt;p&gt;Here is the second question in the Security pillar: “How do you manage identities for people and machines?”&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv9218687xhophbmpmtmf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv9218687xhophbmpmtmf.png" width="800" height="202"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For AWS Accounts with available AWS Premium Support, relevant Trusted Advisor check information is included.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffjt06ciftyaxq85mnlou.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffjt06ciftyaxq85mnlou.png" width="800" height="358"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Relevant AWS Config checks and their status are displayed, along with the detected Resource Type, Resource ID and Tag Name, if available, for easy identification and discussion.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxtrckn1143w8vf85adxn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxtrckn1143w8vf85adxn.png" width="800" height="357"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For Reliability pillar question number 9, Trusted Advisor checks indicate that RDS backups are enabled for all clusters, but there is at least one S3 bucket where replication is not configured.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvkzqu4w802xe4k83rvhz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvkzqu4w802xe4k83rvhz.png" width="800" height="476"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The next section lists all detected AWS resources, including Resource ID, Status and Tag Name, as available (certain information has been obfuscated on purpose).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foklgxw93lnrylmzlbpl3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foklgxw93lnrylmzlbpl3.png" width="800" height="150"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Moving to Cost Optimization pillar question number three, “How do you monitor cost and usage?”: this question actually has no official Trusted Advisor checks associated with it, but the Terraform module fills the gap with custom AWS Config checks.&lt;/p&gt;

&lt;p&gt;Based on this we can easily see that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Cost Anomaly Detection is configured with a Cost monitor.&lt;/li&gt;
&lt;li&gt;There is at least one AWS Budget configured with alert subscriptions.&lt;/li&gt;
&lt;li&gt;There are no EC2 instances outside of Auto Scaling Groups.&lt;/li&gt;
&lt;li&gt;The AWS account is a member of AWS Organizations and a Tag Policy is in effect.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F11m5x6m3ty2mhhi8s4ze.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F11m5x6m3ty2mhhi8s4ze.png" width="800" height="603"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How the reporting feature works
&lt;/h2&gt;

&lt;p&gt;Let’s take a closer look into the underlying logic of the &lt;code&gt;well_architected_report_generator&lt;/code&gt; Lambda function.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6uzoxbqcksobnme9omjo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6uzoxbqcksobnme9omjo.png" width="800" height="807"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Diagram created with the help of the AWS Diagram and Documentation MCP Servers&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;After the Terraform module is deployed, AWS Config needs 24 hours to collect and aggregate all data points. When the Lambda function is then triggered, the following happens:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The starting point is a &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/userguide/workloads.html" rel="noopener noreferrer"&gt;Workload&lt;/a&gt; in the &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/userguide/intro.html" rel="noopener noreferrer"&gt;Well-Architected Tool&lt;/a&gt;. It’s not necessary to answer all questions first; you can do that with your team after the initial report has been generated. Every question in each pillar is then mapped to relevant AWS Config and Trusted Advisor checks. Note: there is currently no complete one-to-one mapping between questions and AWS Config and Trusted Advisor checks, but coverage may increase going forward as more custom checks are added to the Terraform module (contributions are welcome) and to AWS Trusted Advisor.&lt;/li&gt;
&lt;li&gt;Compliance status and information for all mapped checks are collected and aggregated.
Resources in scope are fetched along with Tag Name, as available.
Contextual information and further guidance is retrieved.&lt;/li&gt;
&lt;li&gt;Scores and percentages are calculated.&lt;/li&gt;
&lt;li&gt;Data and information are grouped by pillar.&lt;/li&gt;
&lt;li&gt;The Executive Summary is generated. &lt;/li&gt;
&lt;li&gt;The report is produced based on Python Jinja2 templating functionality and uploaded to a dedicated bucket on Amazon S3. &lt;/li&gt;
&lt;/ol&gt;
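The scoring step above can be sketched as a small aggregation over mapped check results. This is a minimal illustration only; the field names (`pillar`, `status`) are assumptions, not the module's actual data model:

```python
from collections import defaultdict


def pillar_scores(check_results: list) -> dict:
    """Group check results by pillar and compute the percentage of COMPLIANT checks."""
    # pillar -> [compliant count, total count]
    totals = defaultdict(lambda: [0, 0])
    for check in check_results:
        counts = totals[check["pillar"]]
        counts[1] += 1
        if check["status"] == "COMPLIANT":
            counts[0] += 1
    return {pillar: round(100 * c / t, 1) for pillar, (c, t) in totals.items()}
```

The resulting per-pillar percentages could then be fed into a Jinja2 template to render the Executive Summary and per-pillar sections.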

&lt;p&gt;This functionality is part of the Terraform module in the file &lt;a href="https://github.com/soprasteria/terraform-aws-wellarchitected-conformance/blob/main/wa_report_generator.tf" rel="noopener noreferrer"&gt;wa_report_generator.tf&lt;/a&gt;, which includes the AWS Lambda function, a dedicated AWS KMS key for encrypting the compliance reports (as resource and account information may be considered sensitive) and a dedicated Amazon S3 bucket.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The solution now provides even more valuable insight to accelerate Well-Architected Framework Review conversations, leaving more time to discuss opportunities for improvement. It fills some gaps if you don’t have AWS Premium Support available, and adds additional value otherwise. It can easily be deployed in existing Terraform pipelines. Please note that all resources will be cleaned up and deleted when the module call is removed, including the reports, so remember &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/rel-09.html" rel="noopener noreferrer"&gt;REL09&lt;/a&gt; and back up relevant reports accordingly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Feedback and contributions
&lt;/h2&gt;

&lt;p&gt;If you have any feedback &lt;a href="https://hedrange.com/about" rel="noopener noreferrer"&gt;please let me know&lt;/a&gt; through your preferred medium of contact.&lt;/p&gt;

&lt;p&gt;If you would like to contribute with bugfixes, additional functionality or check coverage, &lt;a href="https://github.com/soprasteria/terraform-aws-wellarchitected-conformance/pulls" rel="noopener noreferrer"&gt;pull requests&lt;/a&gt; are welcome!&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/config/" rel="noopener noreferrer"&gt;AWS Config&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/config/latest/developerguide/conformance-packs.html" rel="noopener noreferrer"&gt;AWS Config Conformance Packs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/welcome.html" rel="noopener noreferrer"&gt;AWS Well-Architected Framework&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/well-architected-tool/" rel="noopener noreferrer"&gt;AWS Well-Architected Tool&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/soprasteria/terraform-aws-wellarchitected-conformance/" rel="noopener noreferrer"&gt;GitHub: Terraform module terraform-aws-wellarchitected-conformance&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Development was accelerated by &lt;a href="https://aws.amazon.com/q/developer/" rel="noopener noreferrer"&gt;Amazon Q Developer&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://awslabs.github.io/mcp/servers/aws-diagram-mcp-server/" rel="noopener noreferrer"&gt;AWS Diagram MCP Server&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://awslabs.github.io/mcp/servers/aws-documentation-mcp-server/" rel="noopener noreferrer"&gt;AWS Documentation MCP Server&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The post &lt;a href="https://hedrange.com/2025/07/20/extensive-reporting-of-well-architected-maturity/" rel="noopener noreferrer"&gt;Extensive reporting of Well-Architected Maturity&lt;/a&gt; first appeared on &lt;a href="https://hedrange.com/" rel="noopener noreferrer"&gt;Håkon Eriksen Drange&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>articles</category>
      <category>aws</category>
      <category>wellarchitected</category>
    </item>
    <item>
      <title>How to measure Well-Architected maturity?</title>
      <dc:creator>Håkon Eriksen Drange</dc:creator>
      <pubDate>Tue, 15 Apr 2025 08:14:40 +0000</pubDate>
      <link>https://forem.com/haakoned/how-to-measure-well-architected-maturity-ikp</link>
      <guid>https://forem.com/haakoned/how-to-measure-well-architected-maturity-ikp</guid>
      <description>&lt;p&gt;&lt;strong&gt;TABLE OF CONTENTS&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;Challenge&lt;/li&gt;
&lt;li&gt;
Measuring Well-Architected maturity with Terraform and AWS Config

&lt;ul&gt;
&lt;li&gt;Functional flow of the solution&lt;/li&gt;
&lt;li&gt;Conceptual AWS architecture diagram&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

How to deploy and utilize

&lt;ul&gt;
&lt;li&gt;Viewing measurement insights in AWS Console&lt;/li&gt;
&lt;li&gt;Well-Architected Tool integration&lt;/li&gt;
&lt;li&gt;Event JSON examples for dry_run/live mode&lt;/li&gt;
&lt;li&gt;Event JSON for cleaning notes fields for all questions&lt;/li&gt;
&lt;li&gt;Notice about compliance checks and automation&lt;/li&gt;
&lt;li&gt;Cost of AWS Config evaluations&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;How to remove and decommission after use&lt;/li&gt;

&lt;li&gt;

Behind the scenes

&lt;ul&gt;
&lt;li&gt;AWS Config resources&lt;/li&gt;
&lt;li&gt;AWS Config Conformance Packs&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Feedback and contributions&lt;/li&gt;

&lt;li&gt;Resources&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=femopq3JWJg&amp;amp;t=5537" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2hogfr1n65z745flxjdr.png" width="800" height="364"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“I always think that you should be asking yourself:&lt;/em&gt;&lt;br&gt;&lt;br&gt;
&lt;em&gt;Are you Well-Architected?”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=femopq3JWJg&amp;amp;t=5537" rel="noopener noreferrer"&gt;Dr. Werner Vogels, AWS re:Invent keynote 2018&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Being Well-Architected means that you have taken care of the basics and your foundation is solid, so that you can move fast and focus on business requirements, with decreased risk of surprises. But how can we answer Werner’s question with confidence? How can we &lt;em&gt;measure&lt;/em&gt; Well-Architected maturity in a tangible manner? And what is Well-Architected &lt;em&gt;enough&lt;/em&gt; in your project phase?&lt;/p&gt;

&lt;p&gt;As the first part of the journey I would suggest that you &lt;a href="https://hedrange.com/2023/12/18/move-fast-and-avoid-surprises-be-well-architected/" rel="noopener noreferrer"&gt;plan and conduct a Well-Architected Framework Review&lt;/a&gt;: have a conversation about your solution architecture and how the different best practices from AWS could apply in your context (Measure).&lt;/p&gt;

&lt;p&gt;In most review conversations I observe teams spending a substantial amount of time trying to understand whether the designed or provisioned resources meet the AWS recommended best practice configurations. “Did we set up alerting for this scenario?”, “Did we configure encryption at rest for the database cluster?”, “Did anyone get any alerts about spiking costs?”, “Is our documentation still up to date?” and so on.&lt;/p&gt;

&lt;p&gt;During a conversation, spending less effort on &lt;em&gt;Measure&lt;/em&gt; gives us more time to &lt;em&gt;Learn&lt;/em&gt; about best practices and align on opportunities for &lt;em&gt;Improvement&lt;/em&gt;. Personally, I don’t believe that automation and AI capabilities will fully replace the WAFR lifecycle, but they may help &lt;em&gt;accelerate&lt;/em&gt; it, reducing Mean-Time To Deployed Improvement for your users (MTTDI) [yes, I just invented that term].&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ydvlnkdryn7v3vbu962.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ydvlnkdryn7v3vbu962.png" alt="AWS WAFR process" width="300" height="292"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Figure – Well-Architected Framework review cycle courtesy of AWS&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Challenge
&lt;/h2&gt;

&lt;p&gt;You could use a variety of options for measuring whether cloud resources meet AWS best practices. Most commonly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/security-hub/" rel="noopener noreferrer"&gt;AWS Security Hub&lt;/a&gt; with the &lt;a href="https://docs.aws.amazon.com/securityhub/latest/userguide/fsbp-standard.html" rel="noopener noreferrer"&gt;AWS Foundational Security Best Practices&lt;/a&gt; control reference (highly recommended).&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/premiumsupport/technology/trusted-advisor/" rel="noopener noreferrer"&gt;AWS Trusted Advisor&lt;/a&gt; (full set of checks requires Business or Enterprise Support from AWS).&lt;/li&gt;
&lt;li&gt;3rd party open-source tools such as &lt;a href="https://prowler.com/" rel="noopener noreferrer"&gt;Prowler&lt;/a&gt; and &lt;a href="https://steampipe.io/" rel="noopener noreferrer"&gt;Steampipe&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;3rd party SaaS vendors offering similar functionality in APM/Observability services.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some of these options may not be available during a Well-Architected Framework Review due to company policies on changes potentially affecting the entire AWS Organization, cost development, security validations or procurement.&lt;/p&gt;

&lt;p&gt;But what if AWS native technology, provisioned for a limited time period, would be acceptable? In this article I will share a possible approach to measuring Well-Architected maturity in the form of AWS Config Conformance Packs.&lt;/p&gt;
&lt;h2&gt;
  
  
  Measuring Well-Architected maturity with Terraform and AWS Config
&lt;/h2&gt;

&lt;p&gt;Recently I have been working on a Terraform module which can be utilized in scenarios with constraints as described above. A particular reflection I’ve made is that most tools focus primarily on security and reliability. Some dedicated offerings focus solely on Cloud Financial Management and Cost Optimization (also called FinOps), but finding one complete &lt;a href="https://en.wikipedia.org/wiki/Commercial_off-the-shelf" rel="noopener noreferrer"&gt;COTS&lt;/a&gt; Solution To Rule Them All (that doesn’t charge a premium) is unlikely.&lt;/p&gt;

&lt;p&gt;If we roll up our sleeves and develop our own solution supporting our own custom logic, we can also cover other aspects such as Cost Optimization.&lt;/p&gt;

&lt;p&gt;This Terraform module deploys AWS Config Conformance Packs mapped to pillars in the Well-Architected Framework.&lt;/p&gt;

&lt;p&gt;For relevant pillars in the &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/welcome.html" rel="noopener noreferrer"&gt;AWS Well-Architected Framework&lt;/a&gt;, each best practice that is specific enough to be detected will report as COMPLIANT or NON_COMPLIANT. Some best practices are harder to measure, or depend on subjective judgment about whether a team is happy with how things are or sees room for improvement:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How a team evaluates culture and priorities.&lt;/li&gt;
&lt;li&gt;How satisfied a team is with insight into their workload(s) or business continuity and disaster recovery planning.&lt;/li&gt;
&lt;li&gt;How to practice cloud financial management.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Best practices in Operational Excellence are not straightforward to detect, as observability implementations may involve subjective opinions on room for improvement or may be built with 3rd party tools. The main outcome of this module is to accelerate the Well-Architected Framework Review conversation, not to replace it with automation. Our hope is to shift the focus from “how did we configure this?” to “this is where we are today, what could we do to improve?”, thus freeing up valuable time for busy teams.&lt;/p&gt;

&lt;p&gt;In addition, the Notes field in the Well-Architected Tool can be populated directly with AWS Config resource compliance check results, leaving you with more insight to discuss improvement actions.&lt;/p&gt;

&lt;p&gt;| &lt;strong&gt;Well-Architected Framework Pillar&lt;/strong&gt; | &lt;strong&gt;Status as of April 2025&lt;/strong&gt; |&lt;br&gt;
| Operational Excellence | 0 checks |&lt;br&gt;
| Security (majority of checks) | 128 checks |&lt;br&gt;
| Reliabilit | 69 checks |&lt;br&gt;
| Performance Efficiency | 0 checks |&lt;br&gt;
| Cost Optimization | 6 checks |&lt;br&gt;
| Sustainability | 0 checks |&lt;/p&gt;
&lt;h4&gt;
  
  
  Functional flow of the solution
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4y5xg9fve1di19twzlxv.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4y5xg9fve1di19twzlxv.jpg" width="500" height="1024"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Figure – Flow sequence accelerating WAFR conversations&lt;/em&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Conceptual AWS architecture diagram
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;AWS Config Configuration Recorder
• Records configuration changes for resources in your local AWS account (no impact or dependencies on AWS Organizations Config recorder(s))
• Set to record either daily or continuously (configurable)
• Stores configuration snapshots in a dedicated Amazon S3 bucket&lt;/li&gt;
&lt;li&gt;Amazon S3 Bucket
• Stores AWS Config configuration snapshots
• Stores CloudFormation templates for conformance packs
• Encrypted with a dedicated KMS key&lt;/li&gt;
&lt;li&gt;AWS Config Conformance Packs
• Well-Architected-Security
• Well-Architected-Reliability
• Well-Architected-Cost-Optimization
• Well-Architected-IAM (optional, subset of Security checks)&lt;/li&gt;
&lt;li&gt;Custom Lambda Functions
• Cost Optimization checks:
• Account structure implementation
• AWS Budgets configuration
• AWS Cost Anomaly Detection
• Organization information in cost and usage
• EC2 instances without Auto Scaling Groups&lt;/li&gt;
&lt;li&gt;Well-Architected Tool Updater Lambda function
• Retrieves compliance data from AWS Config
• Maps compliance results to specific Well-Architected Framework best practices
• Updates Notes fields in Well-Architected Tool&lt;/li&gt;
&lt;/ol&gt;
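&lt;p&gt;The core loop of the updater can be sketched with boto3 (a simplified illustration, not the module’s actual source; the helper names are hypothetical):&lt;/p&gt;

```python
NOTES_LIMIT = 2084  # Well-Architected Tool Notes field character limit


def format_compliance_lines(evaluation_results):
    """Render AWS Config evaluation results as 'ResourceType ResourceId: STATUS' lines."""
    lines = []
    for result in evaluation_results:
        qualifier = result["EvaluationResultIdentifier"]["EvaluationResultQualifier"]
        lines.append(
            f"{qualifier['ResourceType']} {qualifier['ResourceId']}: {result['ComplianceType']}"
        )
    return "\n".join(lines)[:NOTES_LIMIT]


def update_notes(workload_id, question_id, rule_name):
    """Copy the compliance status of one AWS Config rule into a WA Tool Notes field."""
    import boto3  # available in the Lambda runtime

    config = boto3.client("config")
    results = []
    pages = config.get_paginator("get_compliance_details_by_config_rule").paginate(
        ConfigRuleName=rule_name
    )
    for page in pages:
        results.extend(page["EvaluationResults"])
    boto3.client("wellarchitected").update_answer(
        WorkloadId=workload_id,
        LensAlias="wellarchitected",
        QuestionId=question_id,
        Notes=format_compliance_lines(results),
    )
```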

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08ob11xxha4o3bhe4s5c.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08ob11xxha4o3bhe4s5c.jpg" width="800" height="721"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  How to deploy and utilize
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;At least two days before your planned review, deploy the module as suggested in &lt;a href="https://github.com/soprasteria/terraform-aws-wellarchitected-conformance/blob/main/examples/main.tf" rel="noopener noreferrer"&gt;examples/main.tf&lt;/a&gt; and described below. Compliance checks update on a daily basis to keep AWS Config evaluation costs down.&lt;/li&gt;
&lt;li&gt;Right before the review, trigger the Lambda function well_architected_tool_updater to populate the Notes sections of your Well-Architected Tool workload with the compliance status from the AWS Config conformance packs.&lt;/li&gt;
&lt;li&gt;Run the review and use the data in the Notes fields to drive the discussion. No checked/answered questions will be modified; answering them remains a matter of subjective evaluation.&lt;/li&gt;
&lt;/ol&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;provider "aws" {
  region = "eu-west-1" # Change to your preferred region
}

module "well_architected_conformance" {
  source = "git::https://github.com/soprasteria/terraform-aws-wellarchitected-conformance.git?ref=c006f439fc07d2e898cc7f67c5e7bcad1dcbd2e8"

  # AWS Config recording configuration
  recording_frequency = "DAILY" # Use DAILY to reduce costs

  # Deploy conformance packs
  deploy_security_conformance_pack = true
  deploy_reliability_conformance_pack = true
  deploy_cost_optimization_conformance_pack = true
  deploy_iam_conformance_pack = true
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Viewing measurement insights in AWS Console
&lt;/h3&gt;

&lt;p&gt;Navigating to AWS Config – Conformance packs will present a dashboard with packs for the Security, Reliability and Cost Optimization Pillars by default, plus IAM for Identity and Access Management, if enabled.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flopyc7a1lz1qn9avl4d0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flopyc7a1lz1qn9avl4d0.png" width="800" height="526"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can view the compliance score trend for each pillar/pack:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffcyaii03gkk8g55xf2xw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffcyaii03gkk8g55xf2xw.png" width="800" height="550"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can also view the compliance status for each check, prefixed with the related best practice question, mapped to the &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/the-pillars-of-the-framework.html" rel="noopener noreferrer"&gt;AWS Well-Architected Framework whitepaper&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7l1ioleossacimi0nrec.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7l1ioleossacimi0nrec.png" width="800" height="266"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Well-Architected Tool integration
&lt;/h3&gt;

&lt;p&gt;This module can also automatically update Well-Architected Tool workloads with compliance data from the AWS Config Conformance Packs.&lt;/p&gt;

&lt;p&gt;The Lambda function well_architected_tool_updater will:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Process each conformance pack (Security, Reliability, Cost Optimization).&lt;/li&gt;
&lt;li&gt;Loop through all rules in sequence (SEC01, SEC02, REL01, REL02, COST01, etc.).&lt;/li&gt;
&lt;li&gt;For each rule, list the resource type, resource ID, and compliance status in the Notes field of the corresponding best practice question of your Well-Architected Tool workload.

&lt;ul&gt;
&lt;li&gt;The Notes field is limited to a maximum of 2084 characters. When more resources are discovered than there is room for, the results are summarized.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Overwrite old data if triggered more than once.&lt;/li&gt;
&lt;li&gt;If you would like to erase the contents of all Notes fields, set the &lt;code&gt;clean_notes&lt;/code&gt; input parameter to 1.&lt;/li&gt;
&lt;/ol&gt;
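&lt;p&gt;The summarization step when the 2084-character limit is exceeded might look roughly like this (an illustrative sketch; the module’s actual output format may differ):&lt;/p&gt;

```python
NOTES_LIMIT = 2084  # Well-Architected Tool Notes field character limit


def summarize_if_needed(lines):
    """Return full per-resource detail if it fits in the Notes field,
    otherwise collapse to counts per compliance status."""
    detail = "\n".join(lines)
    if len(detail) > NOTES_LIMIT:
        counts = {}
        for line in lines:
            status = line.rsplit(": ", 1)[-1]
            counts[status] = counts.get(status, 0) + 1
        return "\n".join(
            f"{status}: {count} resource(s)" for status, count in sorted(counts.items())
        )
    return detail
```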

&lt;p&gt;The source code for the Lambda function is located in the &lt;a href="https://github.com/soprasteria/terraform-aws-wellarchitected-conformance/blob/main/src/wa_tool_updater" rel="noopener noreferrer"&gt;src/wa_tool_updater&lt;/a&gt; directory.&lt;/p&gt;

&lt;p&gt;To trigger the Well-Architected Tool updater, go to Well-Architected Tool and extract the Workload ID (not the full resource ARN).&lt;/p&gt;
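&lt;p&gt;If you only have the full workload ARN at hand, the Workload ID is its final path segment; a small illustrative helper:&lt;/p&gt;

```python
def workload_id_from_arn(arn):
    """Extract the Workload ID from a Well-Architected workload ARN, e.g.
    arn:aws:wellarchitected:eu-west-1:123456789012:workload/141970ea95fd5b4329cea05202659f39
    """
    return arn.rsplit("/", 1)[-1]
```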

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6mbzc13p6keyrg9k9mxg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6mbzc13p6keyrg9k9mxg.png" width="800" height="469"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then go to AWS Lambda and find the function well_architected_tool_updater. Create a test event JSON definition as follows (Console or CLI):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl9uiow9a72x51iypi1x5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl9uiow9a72x51iypi1x5.png" width="800" height="295"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Event JSON examples for dry_run/live mode
&lt;/h4&gt;

&lt;p&gt;Extract the Well-Architected Tool Workload ID from Properties – ARN. This example with &lt;code&gt;dry_run = 1&lt;/code&gt; will find relevant compliance data and log to CloudWatch Logs. No changes or updates will be performed.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "workload_id": "141970ea95fd5b4329cea05202659f39",
  "dry_run": 1,
  "clean_notes": 0
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Setting &lt;code&gt;dry_run = 0&lt;/code&gt; will update the Notes fields. No checked/answered questions will be modified.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "workload_id": "141970ea95fd5b4329cea05202659f39",
  "dry_run": 0,
  "clean_notes": 0
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
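&lt;p&gt;The same invocation can also be scripted instead of using a Console test event; a hedged sketch with boto3, using the function name as deployed by the module:&lt;/p&gt;

```python
import json


def build_event(workload_id, dry_run=1, clean_notes=0):
    """Build the test-event JSON for the updater Lambda."""
    return json.dumps(
        {"workload_id": workload_id, "dry_run": dry_run, "clean_notes": clean_notes}
    )


def invoke_updater(workload_id, dry_run=1, clean_notes=0):
    """Synchronously invoke the updater Lambda and return its parsed response."""
    import boto3  # assumes AWS credentials and region are configured

    response = boto3.client("lambda").invoke(
        FunctionName="well_architected_tool_updater",
        Payload=build_event(workload_id, dry_run, clean_notes),
    )
    return json.load(response["Payload"])
```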



&lt;h4&gt;
  
  
  Event JSON for cleaning notes fields for all questions
&lt;/h4&gt;

&lt;p&gt;If you end up with a mess and would like a fresh start, setting clean_notes to 1 will clear the Notes fields for all questions and return. No further changes to checked/answered questions or compliance data updates will be performed.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "workload_id": "141970ea95fd5b4329cea05202659f39",
  "dry_run": 1,
  "clean_notes": 1
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected output is as follows. Full log output is available in CloudWatch Logs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F33ox7fpmiomjgt36if5v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F33ox7fpmiomjgt36if5v.png" width="800" height="510"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Back in Well-Architected Tool, the notes field will now be updated with detected compliance for &lt;em&gt;SEC 4. How do you detect and investigate security events?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffsomoyc7bwbmvpi960x1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffsomoyc7bwbmvpi960x1.png" width="796" height="654"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foapvlk59oczetg83aey2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foapvlk59oczetg83aey2.png" width="773" height="1024"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Notice about compliance checks and automation
&lt;/h4&gt;

&lt;p&gt;Check data is based on all resources in the current AWS account; tag-based filtering is currently not supported. Keep this in mind if you have multiple workloads in the same AWS account.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost of AWS Config evaluations
&lt;/h3&gt;

&lt;p&gt;According to the &lt;a href="https://aws.amazon.com/config/pricing/" rel="noopener noreferrer"&gt;AWS Config pricing page&lt;/a&gt;: &lt;em&gt;“With AWS Config, you are charged based on the number of configuration items recorded, the number of active AWS Config rule evaluations, and the number of conformance pack evaluations in your account. A configuration item is a record of the configuration state of a resource in your AWS account. An AWS Config rule evaluation is a compliance state evaluation of a resource by an AWS Config rule in your AWS account. A conformance pack evaluation is the evaluation of a resource by an AWS Config rule within the conformance pack”.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;AWS Config supports &lt;a href="https://docs.aws.amazon.com/config/latest/developerguide/select-resources.html#select-resources-recording-frequency" rel="noopener noreferrer"&gt;Continuous recording and Daily recording&lt;/a&gt;. You can choose between Daily or Continuous by setting the desired value for the variable &lt;code&gt;recording_frequency&lt;/code&gt;, which defaults to DAILY.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to remove and decommission after use
&lt;/h2&gt;

&lt;p&gt;Some might see this solution as valuable long-term; others might have incoming tools that overlap with it.&lt;/p&gt;

&lt;p&gt;As this Terraform module deploys an S3 bucket for storing Config evaluations, the bucket must be emptied before it can be deleted.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Empty the bucket in the AWS Console.&lt;/li&gt;
&lt;li&gt;Remove the Terraform module call declaration from your code base.&lt;/li&gt;
&lt;li&gt;Trigger your CI/CD pipeline.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Behind the scenes
&lt;/h2&gt;

&lt;p&gt;Below are some Terraform snippets showing how to deploy an AWS Config Conformance Pack and how to write a custom AWS Config check.&lt;/p&gt;

&lt;h4&gt;
  
  
  AWS Config resources
&lt;/h4&gt;

&lt;p&gt;To avoid dependencies or conflicts with existing AWS Organization based AWS Config, this module deploys a dedicated AWS Config Recorder, which has to be started after provisioning.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Excerpts for illustration, not complete example, see main.tf

# AWS Config Delivery Channel to S3
resource "aws_config_delivery_channel" "well_architected" {
  name = "well_architected_config_delivery_channel"
  s3_bucket_name = module.aws_config_well_architected_recorder_s3_bucket.s3_bucket_id
  depends_on = [aws_config_configuration_recorder.well_architected]
}

# AWS Config Configuration Recorder with recording_frequency set by input variable
resource "aws_config_configuration_recorder" "well_architected" {
  name = "well-architected"
  role_arn = aws_iam_role.config_role.arn

  recording_group {
    all_supported = true
    include_global_resource_types = true
  }

  recording_mode {
    recording_frequency = var.recording_frequency
  }
}

# AWS Config retention configuration: Number of days AWS Config stores your historical information.
resource "aws_config_retention_configuration" "example" {
  retention_period_in_days = 400
}

# Manages status (recording / stopped) of an AWS Config Configuration Recorder.
resource "aws_config_configuration_recorder_status" "well_architected" {
  name = aws_config_configuration_recorder.well_architected.name
  is_enabled = true
  depends_on = [aws_config_delivery_channel.well_architected]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  AWS Config Conformance Packs
&lt;/h4&gt;

&lt;p&gt;Security, Reliability and IAM conformance packs are based on AWS’ library of &lt;a href="https://docs.aws.amazon.com/config/latest/developerguide/conformancepack-sample-templates.html" rel="noopener noreferrer"&gt;Conformance Pack Sample Templates for AWS Config&lt;/a&gt; (in CloudFormation format):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/config/latest/developerguide/operational-best-practices-for-wa-Security-Pillar.html" rel="noopener noreferrer"&gt;Operational Best Practices for AWS Well-Architected Framework Security Pillar&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/config/latest/developerguide/operational-best-practices-for-wa-Reliability-Pillar.html" rel="noopener noreferrer"&gt;Operational Best Practices for AWS Well-Architected Framework Reliability Pillar&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/config/latest/developerguide/operational-best-practices-for-aws-identity-and-access-management.html" rel="noopener noreferrer"&gt;Operational Best Practices for AWS Identity And Access Management&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The underlying checks are &lt;a href="https://docs.aws.amazon.com/config/latest/developerguide/managed-rules-by-aws-config.html" rel="noopener noreferrer"&gt;AWS Config Managed Rules&lt;/a&gt; and cannot be edited. The CloudFormation templates are imported in Terraform as data objects. ConfigRuleNames are replaced to suit the particular Well-Architected Framework Pillar and best practice.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Excerpts for illustration, not complete example
locals {
  url_template_body_wa_security_pillar = "https://raw.githubusercontent.com/awslabs/aws-config-rules/refs/heads/master/aws-config-conformance-packs/Operational-Best-Practices-for-AWS-Well-Architected-Security-Pillar.yaml"
}

data "http" "template_body_wa_security_pillar" {
  url = local.url_template_body_wa_security_pillar
}

data "util_replace" "transformed_wa_security_pillar" {
  content = data.http.template_body_wa_security_pillar.response_body
  replacements = {
    "account-part-of-organizations" : "SEC01-securely-operate_bp_account-part-of-organizations",
    "ec2-instance-managed-by-systems-manager" : "SEC01-securely-operate_bp_ec2-instance-managed-by-systems-manager",
    "codebuild-project-envvar-awscred-check" : "SEC01-securely-operate_bp_codebuild-project-envvar-awscred-check",
    "mfa-enabled-for-iam-console-access" : "SEC02-identities_bp_mfa-enabled-for-iam-console-access"
     # .. and so on
    }
}

# Render templates to file on S3 to avoid template_body file limitation of 51,200 bytes
resource "aws_s3_object" "cloudformation_wa_config_security_template" {
  bucket = module.aws_config_well_architected_recorder_s3_bucket.s3_bucket_id
  key = "Cloudformation/wa-config-security.yaml"
  content = data.util_replace.transformed_wa_security_pillar.replaced
  content_type = "application/yaml"
}

# Takes the source Cloudformation file from S3, generates an AWS Config Conformance pack which behind the scenes creates an AWS managed Cloudformation stack. 
resource "aws_config_conformance_pack" "well_architected_conformance_pack_security" {
  count = var.deploy_security_conformance_pack ? 1 : 0
  name = "Well-Architected-Security"
  template_s3_uri = "s3://${module.aws_config_well_architected_recorder_s3_bucket.s3_bucket_id}/${aws_s3_object.cloudformation_wa_config_security_template.key}"
  depends_on = [aws_config_configuration_recorder.well_architected]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Cost Optimization Conformance Pack is built from scratch. &lt;a href="https://docs.aws.amazon.com/config/latest/developerguide/evaluate-config_develop-rules.html" rel="noopener noreferrer"&gt;Custom Lambda Rules&lt;/a&gt; may be implemented like this:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# AWS Lambda function based on module from terraform-aws-modules
module "lambda_function_wa_conformance_cost_03_aws_budgets" {
  source = "git::https://github.com/terraform-aws-modules/terraform-aws-lambda.git?ref=f7866811bc1429ce224bf6a35448cb44aa5155e7"
  trigger_on_package_timestamp = false
  function_name = "WA-COST03-BP05-AWS-Budgets"
  description = "AWS Config Custom Rule which checks for AWS Budgets setup according to WAF COST03-BP05."
  handler = "index.lambda_handler"
  runtime = var.lambda_python_runtime
  source_path = "../../local-modules/wa-config-conformance/src/cost03_aws_budgets/index.py"
  attach_policy_statements = true
  timeout = var.lambda_timeout
  cloudwatch_logs_retention_in_days = var.lambda_cloudwatch_logs_retention_in_days
  policy_statements = {
    statement = {
      effect = "Allow"
      actions = [
        "budgets:DescribeBudgets",
        "budgets:ViewBudget",
        "config:PutEvaluations"
      ]
      resources = ["*"]
    }
  }

  tags = {
    Name = "Well-Architected-Conformance-COST03-BP05-AWS-Budgets"
  }
}

resource "aws_config_config_rule" "cost_01_aws_budgets" {
  name = "cost01-cloud-financial-management_bp_aws-budgets"
  description = "Checks for AWS Budgets setup according to WAF COST01-BP05 Report and notify on cost optimization."

  source {
    owner = "CUSTOM_LAMBDA"
    source_identifier = module.lambda_function_wa_conformance_cost_03_aws_budgets.lambda_function_arn

    source_detail {
      message_type = "ScheduledNotification"
      maximum_execution_frequency = var.scheduled_config_custom_lambda_periodic_trigger_interval
    }
  }

  depends_on = [module.lambda_function_wa_conformance_cost_03_aws_budgets]
}

# Lambda permissions for all AWS Config Custom Lambda Rules
resource "aws_lambda_permission" "config_permissions" {
  for_each = toset([
    module.lambda_function_wa_conformance_cost_02_account_structure_implemented.lambda_function_name,
    module.lambda_function_wa_conformance_cost_03_aws_budgets.lambda_function_name,
    module.lambda_function_wa_conformance_cost_03_aws_cost_anomaly_detection.lambda_function_name,
    module.lambda_function_wa_conformance_cost_03_add_organization_information_to_cost_and_usage.lambda_function_name,
    module.lambda_function_wa_conformance_cost_04_ec2_instances_without_auto_scaling.lambda_function_name
  ])

  statement_id = "AllowConfigInvoke"
  action = "lambda:InvokeFunction"
  function_name = each.value
  principal = "config.amazonaws.com"
  source_account = local.aws_account_id
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
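&lt;p&gt;For context, the handler such a rule invokes follows the standard contract for AWS Config custom Lambda rules: evaluate, then report back with PutEvaluations. A simplified sketch of what an index.py for the Budgets check could look like (illustrative only, not the module’s actual source):&lt;/p&gt;

```python
import json


def evaluate_budgets(budgets):
    """Pure decision logic: compliant when at least one AWS Budget exists."""
    if budgets:
        return "COMPLIANT", f"{len(budgets)} AWS Budget(s) configured"
    return "NON_COMPLIANT", "No AWS Budgets configured"


def lambda_handler(event, context):
    import boto3  # available in the Lambda runtime

    account_id = context.invoked_function_arn.split(":")[4]
    budgets = boto3.client("budgets").describe_budgets(AccountId=account_id).get("Budgets", [])
    compliance_type, annotation = evaluate_budgets(budgets)
    # Report the result back to AWS Config for this scheduled evaluation.
    boto3.client("config").put_evaluations(
        Evaluations=[
            {
                "ComplianceResourceType": "AWS::::Account",
                "ComplianceResourceId": account_id,
                "ComplianceType": compliance_type,
                "Annotation": annotation,
                "OrderingTimestamp": json.loads(event["invokingEvent"])[
                    "notificationCreationTime"
                ],
            }
        ],
        ResultToken=event["resultToken"],
    )
```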



&lt;h2&gt;
  
  
  Feedback and contributions
&lt;/h2&gt;

&lt;p&gt;If you have any feedback, &lt;a href="https://dev.to/about"&gt;please let me know&lt;/a&gt; through your preferred medium of contact.&lt;/p&gt;

&lt;p&gt;If you would like to contribute with additional functionality and check coverage, &lt;a href="https://github.com/soprasteria/terraform-aws-wellarchitected-conformance/pulls" rel="noopener noreferrer"&gt;pull requests&lt;/a&gt; are welcome!&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/config/" rel="noopener noreferrer"&gt;AWS Config&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/config/latest/developerguide/conformance-packs.html" rel="noopener noreferrer"&gt;AWS Config Conformance Packs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/welcome.html" rel="noopener noreferrer"&gt;AWS Well-Architected Framework&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/well-architected-tool/" rel="noopener noreferrer"&gt;AWS Well-Architected Tool&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/soprasteria/terraform-aws-wellarchitected-conformance/" rel="noopener noreferrer"&gt;GitHub: Terraform module terraform-aws-wellarchitected-conformance&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The post &lt;a href="https://hedrange.com/2025/04/15/how-to-measure-well-architected-maturity/" rel="noopener noreferrer"&gt;How to measure Well-Architected maturity?&lt;/a&gt; first appeared on &lt;a href="https://hedrange.com/" rel="noopener noreferrer"&gt;Håkon Eriksen Drange&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>articles</category>
      <category>aws</category>
      <category>wellarchitected</category>
      <category>measuring</category>
    </item>
    <item>
      <title>AWS re:Invent re:Cap talk – Simplifying developer experience with new features in AWS Step Functions</title>
      <dc:creator>Håkon Eriksen Drange</dc:creator>
      <pubDate>Fri, 17 Jan 2025 07:43:58 +0000</pubDate>
      <link>https://forem.com/aws-builders/aws-reinvent-recap-talk-simplifying-developer-experience-with-new-features-in-aws-step-functions-5042</link>
      <guid>https://forem.com/aws-builders/aws-reinvent-recap-talk-simplifying-developer-experience-with-new-features-in-aws-step-functions-5042</guid>
      <description>&lt;p&gt;Monday January 13th I was invited by AWS User Group Oslo to participate in the traditional &lt;a href="https://www.meetup.com/AWS-User-Group-Norway/events/305162785" rel="noopener noreferrer"&gt;AWS re:Invent re:Cap meetup&lt;/a&gt;. My re:Cap contribution was a reflection on recent and valuable features in AWS Step Functions which can make the developer experience simpler and more efficient, reducing the time from idea to business value.&lt;/p&gt;

&lt;h4&gt;
  
  
  Main topics of my re:Cap talk
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;How to manage AWS Step Functions configuration with Infrastructure-as-Code&lt;/li&gt;
&lt;li&gt;Replacing application (Lambda) code with Step Functions workflow configuration&lt;/li&gt;
&lt;li&gt;Breaking apart a “Lambda-lith”&lt;/li&gt;
&lt;li&gt;Step Functions intrinsic functions&lt;/li&gt;
&lt;li&gt;New support for JSONata, in addition to JSONPath&lt;/li&gt;
&lt;li&gt;Walkthrough of JSONata native functionality&lt;/li&gt;
&lt;li&gt;New support for Variables&lt;/li&gt;
&lt;li&gt;Step Functions Distributed Map state&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;“&lt;em&gt;Customers choose Step Functions to build complex workflows that involve multiple services such as &lt;a href="https://aws.amazon.com/lambda/" rel="noopener noreferrer"&gt;AWS Lambda&lt;/a&gt;, &lt;a href="https://aws.amazon.com/fargate/" rel="noopener noreferrer"&gt;AWS Fargate&lt;/a&gt;, &lt;a href="https://aws.amazon.com/bedrock/" rel="noopener noreferrer"&gt;Amazon Bedrock&lt;/a&gt;, and HTTP API integrations. Within these workflows, you build states to interface with these various services, passing input data and receiving responses as output. While you can use Lambda functions for date, time, and number manipulations beyond Step Functions’ intrinsic capabilities, these methods struggle with increasing complexity, leading to payload restrictions, data conversion burdens, and more state changes. This affects the overall cost of the solution. You use variables and JSONata to address this.&lt;/em&gt;“&lt;/p&gt;

&lt;p&gt;&lt;cite&gt;&lt;a href="https://aws.amazon.com/blogs/compute/simplifying-developer-experience-with-variables-and-jsonata-in-aws-step-functions/" rel="noopener noreferrer"&gt;AWS Compute Blog: Simplifying developer experience with variables and JSONata in AWS Step Functions&lt;/a&gt;&lt;/cite&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The source material this re:Cap talk was based on is listed below.&lt;/p&gt;

&lt;h4&gt;
  
  
  Event highlights
&lt;/h4&gt;

&lt;p&gt;Gunnar Grosch from AWS kicked off the event with highlights of new announcements from the pre:Invent and actual re:Invent period with a deeper dive on &lt;a href="https://aws.amazon.com/blogs/database/introducing-amazon-aurora-dsql/" rel="noopener noreferrer"&gt;Amazon Aurora DSQL&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuyawyru9jbyhfyzqqt6r.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuyawyru9jbyhfyzqqt6r.jpg" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then it was my turn, before Colin from Capra Consulting shared his perspective on the state of Platform Engineering.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmah1322o81olb24ceint.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmah1322o81olb24ceint.jpg" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyed3rg84kb7wml1m622a.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyed3rg84kb7wml1m622a.jpg" width="800" height="503"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42w5a672cds293h1e6qa.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42w5a672cds293h1e6qa.jpg" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqjzcodsyp8ktujb9m84i.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqjzcodsyp8ktujb9m84i.jpg" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymxta1781lx6i71qbms4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymxta1781lx6i71qbms4.jpg" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The meetup wrapped up with an interesting panel discussion with Martin from Sopra Steria, Gunnar from AWS, Anders from Webstep and Erlend from Capra.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvj27lawortuaxprztqnw.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvj27lawortuaxprztqnw.jpg" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Many thanks to the organizers for a great event.&lt;/p&gt;

&lt;h4&gt;
  
  
  Slides from my re:Cap talk
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://hedrange.com/wp-content/uploads/2025/01/2025-01-13_-_aws_user_group_oslo_reinvent_recap_-_simplifying_developer_experience_with_new_features_in_aws_step_functions_hed.pdf" rel="noopener noreferrer"&gt;Simplifying developer experience with new features in AWS Step Functions&lt;/a&gt;&lt;a href="https://hedrange.com/wp-content/uploads/2025/01/2025-01-13_-_aws_user_group_oslo_reinvent_recap_-_simplifying_developer_experience_with_new_features_in_aws_step_functions_hed.pdf" rel="noopener noreferrer"&gt;Download&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Reference material
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/step-functions/" rel="noopener noreferrer"&gt;AWS Step Functions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/blogs/compute/simplifying-developer-experience-with-variables-and-jsonata-in-aws-step-functions/" rel="noopener noreferrer"&gt;AWS Compute Blog: Simplifying developer experience with variables and JSONata in AWS Step Functions/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://serverlessland.com/reinvent2024/api402" rel="noopener noreferrer"&gt;https://serverlessland.com/reinvent2024/api402&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://serverlessland.com/reinvent2024/svs401" rel="noopener noreferrer"&gt;https://serverlessland.com/reinvent2024/svs401&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.youtube.com/playlist?list=PL2yQDdvlhXf_Ezjnq7A7LfHBgCYSqzrZS" rel="noopener noreferrer"&gt;https://www.youtube.com/playlist?list=PL2yQDdvlhXf_Ezjnq7A7LfHBgCYSqzrZS&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://community.aws/recaps" rel="noopener noreferrer"&gt;AWS Community re:Caps incl. full deck download&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The post &lt;a href="https://hedrange.com/2025/01/17/aws-reinvent-recap-talk-simplifying-developer-experience-with-new-features-in-aws-step-functions/" rel="noopener noreferrer"&gt;AWS re:Invent re:Cap talk – Simplifying developer experience with new features in AWS Step Functions&lt;/a&gt; first appeared on &lt;a href="https://hedrange.com" rel="noopener noreferrer"&gt;Håkon Eriksen Drange&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>articles</category>
      <category>aws</category>
      <category>stepfunctions</category>
    </item>
    <item>
      <title>Increase system reliability with Immutable Infrastructure – Move fast and avoid surprises</title>
      <dc:creator>Håkon Eriksen Drange</dc:creator>
      <pubDate>Fri, 09 Aug 2024 17:52:53 +0000</pubDate>
      <link>https://forem.com/haakoned/increase-system-reliability-with-immutable-infrastructure-move-fast-and-avoid-surprises-130n</link>
      <guid>https://forem.com/haakoned/increase-system-reliability-with-immutable-infrastructure-move-fast-and-avoid-surprises-130n</guid>
      <description>&lt;p&gt;&lt;strong&gt;Table of contents&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The largest outage in history of IT (so far)&lt;/li&gt;
&lt;li&gt;Configuration drift&lt;/li&gt;
&lt;li&gt;The concept of Immutable Infrastructure&lt;/li&gt;
&lt;li&gt;
Immutable Infrastructure in the AWS Cloud

&lt;ul&gt;
&lt;li&gt;Virtual machine based workloads on EC2&lt;/li&gt;
&lt;li&gt;Container based workloads&lt;/li&gt;
&lt;li&gt;Immutability for Docker and local testing&lt;/li&gt;
&lt;li&gt;Immutability for workloads provisioned with AWS Elastic Container Service (ECS based on EC2 or Fargate)&lt;/li&gt;
&lt;li&gt;Immutability for workloads provisioned with AWS Elastic Kubernetes Service&lt;/li&gt;
&lt;li&gt;About Docker labels/image tags and deployment&lt;/li&gt;
&lt;li&gt;Immutability for serverless workloads provisioned with AWS Lambda&lt;/li&gt;
&lt;li&gt;Data handling in the immutable infrastructure model&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Conclusion&lt;/li&gt;

&lt;li&gt;References&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  The largest outage in history of IT (so far)
&lt;/h2&gt;

&lt;p&gt;On 19 July 2024, the cybersecurity company CrowdStrike distributed a faulty update to its Falcon Sensor security software that caused widespread problems with Microsoft Windows computers running the software. As a result, roughly 8.5 million systems crashed and were unable to properly restart in what has been called the &lt;a href="https://www.cnbc.com/2024/07/19/latest-live-updates-on-a-major-it-outage-spreading-worldwide.html" rel="noopener noreferrer"&gt;largest outage in the history&lt;/a&gt; of information technology and “historic in scale”.&lt;/p&gt;

&lt;p&gt;The outage disrupted daily life, businesses, and governments around the world. Many industries were affected; airlines, airports, banks, hotels, hospitals, manufacturing, stock markets, broadcasting, gas stations, retail stores, emergency services and governmental websites. The worldwide financial damage has been estimated to be at least $10 billion.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What happened?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The CrowdStrike Falcon software suite consists, highly simplified, of a software application (agent), which is versioned, and Rapid Response Content configuration updates, which are not versioned, to facilitate rapid deployment. However, this particular update was not properly quality assured before widespread deployment. In their &lt;a href="https://www.crowdstrike.com/falcon-content-update-remediation-and-guidance-hub/" rel="noopener noreferrer"&gt;Preliminary Post Incident Review&lt;/a&gt;, CrowdStrike promised to improve their quality assurance process and provide customers with “greater control over the delivery of Rapid Response Content updates by allowing granular selection of when and where these updates are deployed.”&lt;/p&gt;

&lt;p&gt;I would like to highlight that there are certain nuances to this case (rapid distribution of security detection/mitigation mechanisms, client machines vs. servers etc.), but my personal stance is that any system providing mission-critical services should have a level of quality assurance that matches its Service Level Agreement/Objective and risk tolerance.&lt;/p&gt;

&lt;p&gt;Outages like this are nothing new. They can happen to any system where changes are applied live or in-place without proper testing in advance. Service disruptions can also occur during regular operating system patching, when a bug or a security issue is introduced, or with any application or server process, depending on how privileged the process is.&lt;/p&gt;

&lt;h2&gt;
  
  
  Configuration drift
&lt;/h2&gt;

&lt;p&gt;Long-running systems will experience configuration drift over time. A clean server started from a golden image has a high degree of certainty about its current state, but as time goes by that certainty decreases. In a traditional (legacy) deployment model, software packages are updated, and cache and log files accumulate on disk. Perhaps not all log file locations are configured for rotation, or an application stores temp files in a non-standard location and not all of them are cleaned up. Over time, disk space utilization increases, and at some point a critical partition such as /boot may fill up without anyone noticing. The server’s state has now deviated significantly from its initial state, and it may be hard to reproduce that same state.&lt;/p&gt;

&lt;p&gt;A software update that passed QA in staging is not guaranteed to apply successfully in production, because of a lack of disk space, junk that has accumulated over time, or custom manual configurations that were supposed to “improve” something in the live environment.&lt;/p&gt;

&lt;p&gt;Let’s say you start out comparing a green Granny Smith against a green Granny Smith, but over the course of time one of them evolves into a Red Delicious apple, and perhaps eventually into an orange Asian pear!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F61f1ux2mj6kr6jdbbqez.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F61f1ux2mj6kr6jdbbqez.jpg" width="739" height="1352"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For production systems, especially mission-critical ones, risk management becomes increasingly important. An operating model based on long running servers is prone to introduce risk and uncertainty as time goes by. Staging and production environments can over the course of time drift and not be identical anymore. You end up with a &lt;a href="https://martinfowler.com/bliki/SnowflakeServer.html" rel="noopener noreferrer"&gt;Snowflake Server&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Can the new state be predicted? Can the previously known state be reproduced? If not, you may have a challenge with rollbacks and disaster recovery.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;It is a good idea to virtually burn down your servers at regular intervals. A server should be like a phoenix, regularly rising from the ashes.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;cite&gt;&lt;a href="https://martinfowler.com/bliki/PhoenixServer.html" rel="noopener noreferrer"&gt;&lt;/a&gt;&lt;a href="https://martinfowler.com/bliki/PhoenixServer.html" rel="noopener noreferrer"&gt;https://martinfowler.com/bliki/PhoenixServer.html&lt;/a&gt;&lt;/cite&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The concept of Immutable Infrastructure
&lt;/h2&gt;

&lt;p&gt;The Cambridge dictionary definition of Immutability is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;The state of not changing, or being unable to be changed&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;cite&gt;&lt;a href="https://dictionary.cambridge.org/dictionary/english/immutability" rel="noopener noreferrer"&gt;&lt;/a&gt;&lt;a href="https://dictionary.cambridge.org/dictionary/english/immutability" rel="noopener noreferrer"&gt;https://dictionary.cambridge.org/dictionary/english/immutability&lt;/a&gt;&lt;/cite&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;With the advent of cloud and more capabilities for automation at our disposal, an alternative pattern called Immutable Infrastructure has gained popularity. The concept is simply about starting from a clean, well-known slate, every time a change is performed.&lt;/p&gt;

&lt;p&gt;The concept of the Immutable Server, or disposable servers, was introduced around 2012 by actors such as Thoughtworks, Netflix and Google. Instead of using configuration management to try to keep systems in compliance, they advocated for using configuration management to create base images for servers that could be torn down and rebuilt at will.&lt;/p&gt;

&lt;p&gt;An Immutable Server is the logical conclusion of this approach, a server that once deployed, is never modified or changed. No software or operating system updates, security patches, application releases or configuration changes are being performed in-place on live production servers. “&lt;a href="https://cloudscaling.com/blog/cloud-computing/the-history-of-pets-vs-cattle/" rel="noopener noreferrer"&gt;Treat your servers like cattle, not like pets&lt;/a&gt;” gained traction. If there was a problem with a live server, it would be terminated and replaced by a new one, from a known state.&lt;/p&gt;

&lt;p&gt;Not even new application releases/software artifacts are deployed to existing servers. The running servers are replaced with new instances that have the software artifacts built in. With load balancing, automatic health checks and blue/green or canary capabilities, deploying new servers or rolling back to previous versions can be done without end-user impact (if done correctly).&lt;/p&gt;

&lt;p&gt;This concept is also described in the AWS Well-Architected Framework – Reliability Pillar – &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/rel_tracking_change_management_immutable_infrastructure.html" rel="noopener noreferrer"&gt;REL08-BP04 Deploy using immutable infrastructure&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Implementing in-place changes to running infrastructure resources&lt;/em&gt;, the common approach, is actually stated as an anti-pattern.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Benefits of establishing this best practice:&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;&lt;strong&gt;Increased consistency across environments:&lt;/strong&gt; Since there are no differences in infrastructure resources across environments, consistency is increased and testing is simplified.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;strong&gt;Reduction in configuration drifts:&lt;/strong&gt; By replacing infrastructure resources with a known and version-controlled configuration, the infrastructure is set to a known, tested, and trusted state, avoiding configuration drifts.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;strong&gt;Reliable atomic deployments:&lt;/strong&gt; Deployments either complete successfully or nothing changes, increasing consistency and reliability in the deployment process.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;strong&gt;Simplified deployments:&lt;/strong&gt; Deployments are simplified because they don’t need to support upgrades. Upgrades are just new deployments.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;strong&gt;Safer deployments with fast rollback and recovery processes:&lt;/strong&gt; Deployments are safer because the previous working version is not changed. You can roll back to it if errors are detected.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;strong&gt;Enhanced security posture:&lt;/strong&gt; By not allowing changes to infrastructure, remote access mechanisms (such as SSH) can be disabled. This reduces the attack vector, improving your organization’s security posture.&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Immutable Infrastructure in the AWS Cloud
&lt;/h2&gt;

&lt;p&gt;This practice can be achieved regardless of compute option, but to reduce time to market and operational overhead it mandates a high degree of automation with CI/CD tools such as AWS CodePipeline and GitHub Actions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Virtual machine based workloads on EC2
&lt;/h3&gt;

&lt;p&gt;In this scenario an automated routine is established that produces virtual machine images, called Amazon Machine Images (AMIs). This “golden image” includes the respective version of the application source code and its dependencies, such as operating system services, at the expected versions. AMIs can be produced without environment-specific configuration built in; configuration is fetched from &lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-parameter-store.html" rel="noopener noreferrer"&gt;AWS Systems Manager Parameter Store&lt;/a&gt; and/or &lt;a href="https://docs.aws.amazon.com/secretsmanager/latest/userguide/intro.html" rel="noopener noreferrer"&gt;AWS Secrets Manager&lt;/a&gt;, so that the same image can be deployed and verified in dev/test, staging and production environments.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.packer.io/" rel="noopener noreferrer"&gt;Hashicorp Packer&lt;/a&gt; has been available for quite some time. AWS also provides the native service &lt;a href="https://aws.amazon.com/image-builder/" rel="noopener noreferrer"&gt;EC2 Image Builder&lt;/a&gt; for this purpose.&lt;/p&gt;

&lt;p&gt;Amazon provides managed AMIs for both Linux (&lt;a href="https://docs.aws.amazon.com/linux/al2023/ug/ec2.html" rel="noopener noreferrer"&gt;Amazon Linux&lt;/a&gt; based on Fedora) and &lt;a href="https://docs.aws.amazon.com/ec2/latest/windows-ami-reference/windows-ami-versions.html" rel="noopener noreferrer"&gt;Windows&lt;/a&gt; workloads that are tailored for security and performance in AWS.&lt;/p&gt;

&lt;p&gt;With Immutable Infrastructure, Windows Updates and Linux unattended-upgrades are disabled. Every time AWS releases a new officially supported base AMI version, or ad-hoc updates are required, the EC2 Image Builder Pipeline is triggered, which produces a new artifact and validates each change. Every new application version fetches the latest quality assured base AMI and produces a self-contained application AMI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7fapx499c9q77eaelmbd.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7fapx499c9q77eaelmbd.jpg" width="422" height="1024"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can find a Terraform example below, with inline comments explaining the EC2 Image Builder resources.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# EC2 Image Builder component that installs Git and Nginx, clones a sample app repo from GitHub and starts the Nginx web server
resource "aws_imagebuilder_component" "hello_world" {
  name = "hello-world-component"
  platform = "Linux"
  version = "1.0.0"
  description = "Hello World application component"

  data {
    name = "hello-world-app-script"
    type = "AWS_LAMBDA"
    content = &amp;lt;&amp;lt;-EOF
      # Install Git and Nginx
      yum install -y git nginx

      # Clone sample Hello World app from GitHub
      mkdir app-repo
      git clone https://github.com/aws-samples/aws-codepipeline-s3-codedeploy-linux app-repo

      # Copy cloned files to Nginx public HTML directory
      cp -r app-repo/* /usr/share/nginx/html/

      # Start Nginx
      systemctl start nginx
    EOF
  }
}

# Recipe named nginx-recipe which includes the nginx-component.
# parent_image is set to Amazon Linux 2 AMI version 2024.7.20. This can be parameterized. 
resource "aws_imagebuilder_image_recipe" "nginx_recipe" {
  name = "nginx-recipe"
  parent_image = "arn:aws:imagebuilder:${var.region}:aws:image/amazon-linux-2-x86/2024.7.20"
  version = "1.0.0"

  component {
    component_arn = aws_imagebuilder_component.nginx.arn
  }
}

# Defines an infrastructure configuration, including the instance profile, security group, and subnet (intentionally not included)
resource "aws_imagebuilder_infrastructure_configuration" "nginx_infra" {
  name = "nginx-infra"
  instance_profile_name = aws_iam_instance_profile.image_builder_instance_profile.name
  security_group_ids = [aws_security_group.image_builder_sg.id]
  subnet_id = aws_subnet.image_builder_subnet.id
  terminate_instance_on_failure = true
}

# Pipeline output should be an Amazon Machine Image (AMI) with a name based on the build date.
resource "aws_imagebuilder_distribution_configuration" "nginx_distribution" {
  name = "nginx-distribution"

  distribution {
    ami_distribution_configuration {
      name = "nginx-ami-{{ imagebuilder:buildDate }}"
    }
    region = var.region
  }
}

# Defines the Image Builder pipeline, ties together the recipe, infrastructure configuration, and distribution configuration.
resource "aws_imagebuilder_image_pipeline" "nginx_pipeline" {
  name = "nginx-pipeline"
  image_recipe_arn = aws_imagebuilder_image_recipe.nginx_recipe.arn
  infrastructure_configuration_arn = aws_imagebuilder_infrastructure_configuration.nginx_infra.arn
  distribution_configuration_arn = aws_imagebuilder_distribution_configuration.nginx_distribution.arn
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Container based workloads
&lt;/h3&gt;

&lt;p&gt;In your CI/CD tool of choice, container images are produced and pushed to a container repository such as AWS Elastic Container Registry (ECR). Even though immutability is one of the foundations behind containers, it is not enforced; it depends on how the container configuration is specified and launched. Container root filesystems are usually writable by default.&lt;/p&gt;

&lt;p&gt;In many cases software packages are updated on container launch, but in the immutability model this is an anti-pattern. A step like “update installed packages and install Apache” can yield different results when executed at different points in time, as new upstream software package updates become available. We need to ensure that we test the exact same artifact in staging that is deployed to production, so the recommended approach is to update all packages at container build time and then launch the container with its root storage partition in read-only mode.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;The container’s root filesystem should be treated as a ‘golden image’ by using Docker run’s &lt;code&gt;--read-only&lt;/code&gt; option. This prevents any writes to the container’s root filesystem at container runtime and enforces the principle of immutable infrastructure.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;cite&gt;&lt;a href="https://docs.datadoghq.com/security/default_rules/cis-docker-1.2.0-5.12/" title="" rel="noopener noreferrer"&gt;&lt;em&gt;CIS Docker Benchmark control 5.12 – Datadog&lt;/em&gt;&lt;/a&gt;&lt;/cite&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  Immutability for Docker and local testing
&lt;/h4&gt;

&lt;p&gt;Adding the &lt;code&gt;--read-only&lt;/code&gt; flag at the container’s runtime mounts the container’s root filesystem as read-only. With the &lt;code&gt;--tmpfs&lt;/code&gt; option it is possible to mount a temporary file system for non-persistent data/cache.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run &amp;lt;Run arguments&amp;gt; --read-only &amp;lt;Container Image Name or ID&amp;gt; &amp;lt;Command&amp;gt;

# Example with the --tmpfs option to mount a temporary file system for non-persistent data/cache
docker run --interactive --tty --read-only --tmpfs "/run" --tmpfs "/tmp" ubuntu /bin/bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Immutability for workloads provisioned with AWS Elastic Container Service (ECS based on EC2 or Fargate)
&lt;/h4&gt;

&lt;p&gt;Configure the &lt;a href="https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html#container_definitions" rel="noopener noreferrer"&gt;Amazon ECS Task Definition file&lt;/a&gt; to set the parameter &lt;code&gt;readonlyRootFilesystem&lt;/code&gt; (in the Storage and logging section) to &lt;code&gt;true&lt;/code&gt;, as the default value is &lt;code&gt;false&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Example &lt;a href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/ecs_task_definition" rel="noopener noreferrer"&gt;Terraform resource definition&lt;/a&gt;, see line 27:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_ecs_task_definition" "hardened_task_definition" {
  family = "hardened-task-definition"
  network_mode = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu = 256
  memory = 512

  volume {
    name = "efs-volume"
    efs_volume_configuration {
      file_system_id = "fs-0123456789abcdef" # Replace with your EFS file system ID
      root_directory = "/tmp"
    }
  }

  container_definitions = jsonencode([
    {
      name = "hardened-container"
      image = "nginx:latest"
      essential = true
      portMappings = [
        {
          containerPort = 8080
          hostPort = 8080
        }
      ]
      readonlyRootFilesystem = true
      volumesFrom = [
        {
          sourceContainer = "efs-volume-container"
        }
      ]
    },
    {
      name = "efs-volume-container"
      image = "amazon/amazon-efs-utils:latest"
      essential = true
      volumeMounts = [
        {
          name = "efs-volume"
          mountPath = "/tmp"
        }
      ]
    }
  ])
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this configuration, the /tmp directory inside the hardened-container will be mounted to the specified EFS file system, allowing temporary files to be stored on the persistent EFS file system instead of the read-only root filesystem.&lt;/p&gt;

&lt;h4&gt;
  
  
  Immutability for workloads provisioned with AWS Elastic Kubernetes Service
&lt;/h4&gt;

&lt;p&gt;This also applies to Kubernetes in general. In the manifest file, specify &lt;code&gt;securityContext&lt;/code&gt;: &lt;code&gt;readOnlyRootFilesystem: true&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Example Kubernetes manifest below, which demonstrates the read-only configuration under &lt;code&gt;securityContext&lt;/code&gt;:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  labels:
    run: nginx
  name: nginx
spec:
  containers:
    - image: nginx
      name: hardened-container
      securityContext:
        readOnlyRootFilesystem: true
      volumeMounts:
        - name: cache-volume
          mountPath: /var/cache/nginx
        - name: runtime-volume
          mountPath: /var/run
        - name: efs-volume
          mountPath: /tmp
  volumes:
    - name: cache-volume
      emptyDir: {}
    - name: runtime-volume
      emptyDir: {}
    - name: efs-volume
      nfs:
        server: efs-server.default.svc.cluster.local
        path: "/efs-share"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After applying this updated manifest, the Nginx container will have an Amazon EFS volume mounted at /tmp, allowing it to use the persistent storage provided by Amazon EFS for temporary files while still maintaining a read-only root filesystem.&lt;/p&gt;

&lt;p&gt;Leverage Kubernetes’ Deployment strategies for blue/green, &lt;a href="https://kubernetes.io/docs/concepts/workloads/management/#canary-deployments" rel="noopener noreferrer"&gt;canary&lt;/a&gt; etc. as best suits your use case.&lt;/p&gt;

&lt;h4&gt;
  
  
  About Docker labels/image tags and deployment
&lt;/h4&gt;

&lt;p&gt;When new Docker images are produced and pushed to a container repository, a common approach is to add the label/tag “latest” (similar to HEAD in Git), and this is also what CI/CD pipelines reference. With this approach, however, it is hard to determine exactly which Docker image was in production at any given time, since the reference to the actual image is lost.&lt;/p&gt;

&lt;p&gt;To follow through on immutability it is recommended to set a unique tag on every new image. You can reference the actual image digest (checksum) or some other application-specific tag like hello-world-v1.2.3 for &lt;a href="https://www.cncf.io/blog/2021/09/28/gitops-101-whats-it-all-about/" rel="noopener noreferrer"&gt;GitOps&lt;/a&gt; and CI/CD deployment. If you also add a tag with the Git commit ID, debugging becomes much easier.&lt;/p&gt;

&lt;p&gt;Rollbacks also become more transparent: you know hello-world-v1.2.3 was live in production, hello-world-v1.2.4 failed the canary health checks, the automatic procedure rolled back to hello-world-v1.2.3, and you can quickly look up the Git commit of hello-world-v1.2.4 to find the change.&lt;/p&gt;
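&lt;p&gt;As a rough illustration, the tagging scheme can be sketched in a few lines of Python; the application name, version format and commit ID below are made-up examples, not a prescribed convention:&lt;/p&gt;

```python
# Build a set of immutable tags for a new container image build,
# instead of pushing a mutable "latest" tag.
def immutable_tags(app: str, version: str, git_commit: str) -> list[str]:
    """Return unique, human-readable tags for one image build."""
    return [
        f"{app}-v{version}",         # referenced by GitOps manifests / pipelines
        f"{app}-{git_commit[:12]}",  # short commit ID for easy debugging
    ]

tags = immutable_tags("hello-world", "1.2.3",
                      "9fceb02d0ae598e95dc970b74767f19372d61af8")
print(tags)  # ['hello-world-v1.2.3', 'hello-world-9fceb02d0ae5']
```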

&lt;p&gt;In Amazon Elastic Container Registry you can &lt;a href="https://docs.aws.amazon.com/AmazonECR/latest/userguide/image-tag-mutability.html" rel="noopener noreferrer"&gt;prevent image tags from being overwritten&lt;/a&gt; by enabling the property Tag immutability.&lt;/p&gt;

&lt;h3&gt;
  
  
  Immutability for serverless workloads provisioned with AWS Lambda
&lt;/h3&gt;

&lt;p&gt;AWS Lambda is immutable by design. Lambda creates a new version of your function each time you publish it. Its code, runtime, architecture, memory, layers, and most other configuration settings remain unchanged.&lt;/p&gt;

&lt;p&gt;Versions can be used to control function deployment. Instead of deploying a change to all production users at once, you can publish a new version for beta testing, or canary testing with a small share of users, by using a specific version reference (a &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/configuration-versions.html#versioning-versions-using" rel="noopener noreferrer"&gt;Qualified ARN&lt;/a&gt; instead of $LATEST) in Amazon API Gateway or other services that route traffic to your Lambda functions.&lt;/p&gt;
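&lt;p&gt;As a minimal sketch, a qualified ARN is simply the unqualified function ARN with a version or alias suffix; the region, account ID and function name below are placeholders:&lt;/p&gt;

```python
# A qualified ARN pins a consumer (for example an API Gateway integration)
# to an immutable, published Lambda version instead of the mutable $LATEST.
BASE_ARN = "arn:aws:lambda:eu-west-1:123456789012:function:helloworld"

def qualified_arn(base_arn: str, qualifier: str) -> str:
    """Append a version number or alias name to an unqualified function ARN."""
    return f"{base_arn}:{qualifier}"

print(qualified_arn(BASE_ARN, "42"))    # pinned to published version 42
print(qualified_arn(BASE_ARN, "prod"))  # routed through the prod alias
```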

&lt;p&gt;In the example below we define three &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/configuration-aliases.html" rel="noopener noreferrer"&gt;aliases&lt;/a&gt;: staging, canary-prod and prod, which refers to relevant Lambda function versions. A &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/configuring-alias-routing.html" rel="noopener noreferrer"&gt;Lambda routing configuration&lt;/a&gt; is defined which directs 90% of the traffic to alias &lt;code&gt;prod&lt;/code&gt; (v41), and 10% to alias &lt;code&gt;canary-prod&lt;/code&gt; (v42).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Procedure&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Test version 42 in staging.&lt;/li&gt;
&lt;li&gt;Roll out version 42 to 10% of all production users.&lt;/li&gt;
&lt;li&gt;CloudWatch Metrics for Lambda functions are observed and grouped by a combination of alias and executed version.&lt;/li&gt;
&lt;li&gt;If application monitoring and health checks stay healthy over a period of e.g. 30 minutes, with no increase in error rates, update the prod alias to increase function_version from 41 to 42 and reset routing_config. If not, roll back.&lt;/li&gt;
&lt;li&gt;If health checks are still OK, end. If not, roll back.&lt;/li&gt;
&lt;/ol&gt;
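&lt;p&gt;The promotion/rollback decision in steps 4 and 5 can be sketched as plain logic; the error-rate tolerance here is an illustrative assumption, and in practice the alias update itself would be an AWS API call or Terraform change:&lt;/p&gt;

```python
# Sketch of the canary decision: compare the error rate of the canary
# version against the current production version over the observation
# window. The tolerance value is an illustrative assumption.
def canary_decision(prod_error_rate: float,
                    canary_error_rate: float,
                    tolerance: float = 0.001) -> str:
    """Return 'promote' when the canary is no worse than prod
    (within tolerance), otherwise 'rollback'."""
    if canary_error_rate <= prod_error_rate + tolerance:
        return "promote"   # shift 100% of traffic to the new version
    return "rollback"      # reset routing to the previous version

print(canary_decision(prod_error_rate=0.002, canary_error_rate=0.002))
print(canary_decision(prod_error_rate=0.002, canary_error_rate=0.050))
```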

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8u9m0ebcm0ipb4ylgmn0.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8u9m0ebcm0ipb4ylgmn0.jpg" width="768" height="1300"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Terraform example of AWS Lambda aliases and routing configuration for canary deployment:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_lambda_alias" "staging" {
    name = "staging"
    function_name = "arn:aws:lambda:aws-region:123456789012:function:helloworld"
    function_version = "42"
    description = "Canary alias for production environment"
}

resource "aws_lambda_alias" "canary_prod" {
    name = "canary-prod"
    function_name = "arn:aws:lambda:aws-region:123456789012:function:helloworld"
    function_version = "42"
    description = "Canary alias for production environment"
  }

resource "aws_lambda_alias" "prod" {
  name = "prod"
  function_name = "arn:aws:lambda:aws-region:123456789012:function:helloworld"
  function_version = "41"
  description = "Alias for main production audience"
}

resource "aws_lambda_update_alias" "prod_routing_with_canary" {
  name = aws_lambda_alias.prod.name
  function_name = aws_lambda_alias.prod.function_name
  function_version = aws_lambda_alias.prod.function_version

  routing_config {
    additional_version_weights = {
      aws_lambda_alias.canary_prod.function_version = 0.1
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Data handling in the immutable infrastructure model
&lt;/h4&gt;

&lt;p&gt;As the compute resources themselves can come and go, data that needs to be persisted, including session state, has to be moved off of the compute tier. Depending on the type of data there are different AWS services at your disposal:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Data type&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Recommended AWS service&lt;/strong&gt; (as starting point)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Application session data/state&lt;/td&gt;
&lt;td&gt;&lt;a href="https://aws.amazon.com/elasticache/" rel="noopener noreferrer"&gt;Amazon ElastiCache Redis/Memcached&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Application cache and temporary data&lt;/td&gt;
&lt;td&gt;&lt;a href="https://aws.amazon.com/efs/" rel="noopener noreferrer"&gt;Amazon Elastic File System (EFS) [“NFS”]&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Relational data&lt;/td&gt;
&lt;td&gt;&lt;a href="https://aws.amazon.com/rds/aurora/" rel="noopener noreferrer"&gt;Amazon Aurora&lt;/a&gt; etc.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Non-relational data&lt;/td&gt;
&lt;td&gt;&lt;a href="https://aws.amazon.com/dynamodb/" rel="noopener noreferrer"&gt;Amazon DynamoDB&lt;/a&gt; etc.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;By adhering to the principle of never changing a running system, a higher degree of predictability and reliability can be achieved.&lt;/p&gt;

&lt;p&gt;Many teams view rollbacks as a pain and avoid spending time on rollback and disaster recovery testing. “We don’t do rollbacks, we prefer to try to fix the problem and roll forward” is a warning signal of manual procedures and lacking automation.&lt;/p&gt;

&lt;p&gt;With the immutable infrastructure approach, rollbacks become a natural part of change management. Database changes can be managed with the &lt;a href="https://www.prisma.io/dataguide/types/relational/expand-and-contract-pattern" rel="noopener noreferrer"&gt;Expand-Contract pattern&lt;/a&gt;. If automated checks fail or error rates increase, the deployment should automatically roll back to the previous known working version, which the database schema still supports. By adopting canary releases and/or blue-green deployments, rollbacks can be performed quickly and with reduced end user impact.&lt;/p&gt;
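
&lt;p&gt;Automatic rollback on failing checks can be wired up with, for example, AWS CodeDeploy. The following is a rough Terraform sketch, with hypothetical application and role names, of a deployment group that rolls back when a CloudWatch alarm on the error rate fires:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Hypothetical example: alarm on elevated 5XX responses behind an ALB.
resource "aws_cloudwatch_metric_alarm" "error_rate" {
  alarm_name          = "app-5xx-error-rate"
  namespace           = "AWS/ApplicationELB"
  metric_name         = "HTTPCode_Target_5XX_Count"
  statistic           = "Sum"
  period              = 60
  evaluation_periods  = 2
  threshold           = 10
  comparison_operator = "GreaterThanThreshold"
}

resource "aws_codedeploy_deployment_group" "app" {
  app_name              = "my-app" # hypothetical
  deployment_group_name = "prod"
  service_role_arn      = aws_iam_role.codedeploy.arn # assumed to exist

  # Roll back automatically on failure or when the alarm fires.
  auto_rollback_configuration {
    enabled = true
    events  = ["DEPLOYMENT_FAILURE", "DEPLOYMENT_STOP_ON_ALARM"]
  }

  alarm_configuration {
    enabled = true
    alarms  = [aws_cloudwatch_metric_alarm.error_rate.alarm_name]
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;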

&lt;p&gt;Another benefit of the immutable infrastructure approach is that failed releases become far less dramatic. By releasing small changes frequently, a failed new release shouldn’t be a big issue. Instead of being up at night or feeling the pressure of having to fight fires to resolve issues impacting end users, teams can investigate in peace and quiet, produce a new artifact version with the bug-fix, and ship it quickly through fully automated and quality assured CI/CD pipelines, to master the art of &lt;a href="https://continuousdelivery.com/principles/" rel="noopener noreferrer"&gt;Continuous Delivery&lt;/a&gt; and realize business value &lt;em&gt;faster&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://repost.aws/knowledge-center/ec2-instance-crowdstrike-agent" rel="noopener noreferrer"&gt;https://repost.aws/knowledge-center/ec2-instance-crowdstrike-agent&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;&lt;a href="https://martinfowler.com/bliki/ImmutableServer.html" rel="noopener noreferrer"&gt;https://martinfowler.com/bliki/ImmutableServer.html&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/rel_tracking_change_management_immutable_infrastructure.html" rel="noopener noreferrer"&gt;AWS Well-Architected Framework REL08-BP04 Deploy using immutable infrastructure&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/rel_tracking_change_management_automated_changemgmt.html" rel="noopener noreferrer"&gt;AWS Well-Architected Framework REL08-BP05 Deploy changes with automation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/operational-excellence-pillar/ops_mit_deploy_risks_auto_testing_and_rollback.html" rel="noopener noreferrer"&gt;AWS Well-Architected Framework OPS06-BP04 Automate testing and rollback&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/operational-excellence-pillar/ops_dev_integ_auto_integ_deploy.html" rel="noopener noreferrer"&gt;AWS Well-Architected Framework OPS05-BP09 Make frequent, small, reversible changes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://12factor.net/processes%20" rel="noopener noreferrer"&gt;https://12factor.net/processes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://netflixtechblog.com/ami-creation-with-aminator-98d627ca37b0%20" rel="noopener noreferrer"&gt;https://netflixtechblog.com/ami-creation-with-aminator-98d627ca37b0&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/imagebuilder/latest/userguide/what-is-image-builder.html" rel="noopener noreferrer"&gt;AWS EC2 Image Builder User Guide&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The post &lt;a href="https://hedrange.com/2024/08/09/move-fast-and-avoid-surprises-increase-system-reliability-with-immutable-infrastructure/" rel="noopener noreferrer"&gt;Increase system reliability with Immutable Infrastructure – Move fast and avoid surprises&lt;/a&gt; first appeared on &lt;a href="https://hedrange.com" rel="noopener noreferrer"&gt;Håkon Eriksen Drange&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>immutability</category>
      <category>wellarchitected</category>
      <category>bastion</category>
    </item>
    <item>
      <title>Bye bye Bastion!</title>
      <dc:creator>Håkon Eriksen Drange</dc:creator>
      <pubDate>Wed, 03 Jul 2024 17:27:38 +0000</pubDate>
      <link>https://forem.com/haakoned/bye-bye-bastion-3dj2</link>
      <guid>https://forem.com/haakoned/bye-bye-bastion-3dj2</guid>
      <description>&lt;h5&gt;
  
  
  &lt;strong&gt;Table of contents&lt;/strong&gt;
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;
Exploring alternative workflows

&lt;ul&gt;
&lt;li&gt;Deploy sample infrastructure&lt;/li&gt;
&lt;li&gt;Well-Architected Virtual Private Cloud (VPC)&lt;/li&gt;
&lt;li&gt;RDS Aurora MySQL Multi-AZ cluster&lt;/li&gt;
&lt;li&gt;EC2 Amazon Linux instance and security group&lt;/li&gt;
&lt;li&gt;AWS Cloud9 SSM Managed Instance&lt;/li&gt;
&lt;li&gt;Alternative 1: AWS Systems Manager – Session Manager&lt;/li&gt;
&lt;li&gt;Prerequisites&lt;/li&gt;
&lt;li&gt;Connecting to an EC2 instance in a private subnet from the AWS Console&lt;/li&gt;
&lt;li&gt;Connecting to an EC2 instance in a private subnet from your local workstation with the AWS CLI and AWS IAM Identity Center&lt;/li&gt;
&lt;li&gt;Connecting to an RDS cluster in a private database subnet with local port forwarding&lt;/li&gt;
&lt;li&gt;Alternative 2: AWS CloudShell VPC Environment&lt;/li&gt;
&lt;li&gt;Alternative 3: AWS Cloud9 IDE&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Conclusion and feature comparison&lt;/li&gt;

&lt;li&gt;References&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://aws.amazon.com/solutions/implementations/linux-bastion/" rel="noopener noreferrer"&gt;Bastion Host&lt;/a&gt;, or Jump Host, has historically been a traditional pattern for providing system administrators with external access to internal compute resources on distinct networks. An actor connects to a dedicated host in a DMZ, most commonly over Secure Shell (SSH) or Remote Desktop Protocol (RDP), and from there gains access to compute resources on internal networks to perform system maintenance, apply patches, check content in a database, update schemas and so on.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuctxfx4vy6h1mip2iolb.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuctxfx4vy6h1mip2iolb.jpg" width="800" height="219"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These Bastion Hosts must be highly secured to withstand attacks. Measures include hardened operating systems and server configurations, removing non-required services and libraries, firewalling and audit logging, but the reality is that many instances do not meet the recommended standards, either through lack of knowledge or configuration mistakes. One of the most common attack vectors is simply not locking down the firewall rules/security group rules to only permit access on relevant ports from trusted IP ranges, leaving SSH/RDP open to anyone (0.0.0.0/0) instead of your corporate gateway or VPN.&lt;/p&gt;

&lt;p&gt;As mentioned in my post &lt;a href="https://hedrange.com/2024/05/23/protect-your-webapps-from-malicious-traffic-with-aws-web-application-firewall/" rel="noopener noreferrer"&gt;Protect your webapps from malicious traffic with AWS Web Application Firewall&lt;/a&gt;: &lt;em&gt;&lt;strong&gt;7 percent of EC2 instances&lt;/strong&gt;, &lt;strong&gt;3 percent of Azure VMs&lt;/strong&gt;, and &lt;strong&gt;13 percent of Google Cloud VMs&lt;/strong&gt; are publicly exposed to the internet. Among instances that are publicly exposed, HTTP and HTTPS are the most commonly exposed ports, and are not considered risky in general. After these, SSH and RDP remote access protocols are common.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;On July 2nd 2024 the critical vulnerability &lt;a href="https://www.qualys.com/regresshion-cve-2024-6387/" rel="noopener noreferrer"&gt;CVE-2024-6387, labeled regreSSHion&lt;/a&gt;, was announced, where an unauthenticated remote code execution in OpenSSH’s server (sshd) could grant full root access. With a vulnerability score of 9.8/10 this is one of the most serious bugs in OpenSSH in years.&lt;/p&gt;

&lt;p&gt;The first thing to do is to stop exposing SSH/RDP, and then find alternative, more modern workflows for accessing cloud resources. Even if access is whitelisted from trusted IP address ranges, it’s a bad practice in 2024 to have direct access into production environments.&lt;/p&gt;

&lt;p&gt;As stated in the AWS Well-Architected Framework – Security Pillar:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Use automation to perform deployment, configuration, maintenance, and investigative tasks wherever possible. Consider manual access to compute resources in cases of emergency procedures or in safe (sandbox) environments, when automation is not available.&lt;/em&gt;  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Common anti-patterns&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Interactive access to Amazon EC2 instances with protocols such as SSH or RDP.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Maintaining individual user logins such as &lt;code&gt;/etc/passwd&lt;/code&gt; or Windows local users.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Sharing a password or private key to access an instance among multiple users.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Manually installing software and creating or updating configuration files.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Manually updating or patching software.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Logging into an instance to troubleshoot problems.&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Removing the use of Secure Shell (SSH) and Remote Desktop Protocol (RDP) for interactive access reduces the scope of access to your compute resources. This takes away a common path for unauthorized actions.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;cite&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/sec_protect_compute_reduce_manual_management.html" title="" rel="noopener noreferrer"&gt;SEC06-BP03 Reduce manual management and interactive access&lt;/a&gt;&lt;/cite&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For reference, the &lt;a href="https://docs.aws.amazon.com/securityhub/latest/userguide/cis-aws-foundations-benchmark.html" rel="noopener noreferrer"&gt;CIS AWS Foundations Benchmark&lt;/a&gt; has multiple controls for detecting public exposure of SSH/RDP. &lt;a href="https://aws.amazon.com/security-hub/" rel="noopener noreferrer"&gt;AWS Security Hub&lt;/a&gt;, &lt;a href="https://aws.amazon.com/config/" rel="noopener noreferrer"&gt;AWS Config&lt;/a&gt; and &lt;a href="https://aws.amazon.com/premiumsupport/technology/trusted-advisor/" rel="noopener noreferrer"&gt;AWS Trusted Advisor&lt;/a&gt; can help you here.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/securityhub/latest/userguide/ec2-controls.html#ec2-53" rel="noopener noreferrer"&gt;[EC2.53]&lt;/a&gt; EC2 security groups should not allow ingress from 0.0.0.0/0 to remote server administration ports

&lt;ul&gt;
&lt;li&gt;This control checks whether an Amazon EC2 security group allows ingress from 0.0.0.0/0 to remote server administration ports (ports 22 and 3389). The control fails if the security group allows ingress from 0.0.0.0/0 to port 22 or 3389.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://docs.aws.amazon.com/securityhub/latest/userguide/ec2-controls.html#ec2-13" rel="noopener noreferrer"&gt;[EC2.13]&lt;/a&gt; Security groups should not allow ingress from 0.0.0.0/0 or ::/0 to port 22

&lt;ul&gt;
&lt;li&gt;This control checks whether an Amazon EC2 security group allows ingress from 0.0.0.0/0 or ::/0 to port 22. The control fails if the security group allows ingress from 0.0.0.0/0 or ::/0 to port 22.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://docs.aws.amazon.com/securityhub/latest/userguide/ec2-controls.html#ec2-21" rel="noopener noreferrer"&gt;[EC2.21]&lt;/a&gt; Network ACLs should not allow ingress from 0.0.0.0/0 to port 22 or port 3389

&lt;ul&gt;
&lt;li&gt;This control checks whether a network access control list (network ACL) allows unrestricted access to the default TCP ports for SSH/RDP ingress traffic. The control fails if the network ACL inbound entry allows a source CIDR block of ‘0.0.0.0/0’ or ‘::/0’ for TCP ports 22 or 3389.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
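
&lt;p&gt;For continuous detection, controls like these can also be complemented with AWS Config managed rules. A minimal Terraform sketch, assuming an AWS Config recorder is already enabled in the account:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_config_config_rule" "restricted_ssh" {
  name = "restricted-ssh"

  source {
    owner = "AWS"
    # Managed rule: flags security groups allowing 0.0.0.0/0 ingress on port 22.
    source_identifier = "INCOMING_SSH_DISABLED"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;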

&lt;h2&gt;
  
  
  Exploring alternative workflows
&lt;/h2&gt;

&lt;p&gt;It’s encouraged to pivot from a classical systems administration approach and substitute interactive access with &lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/what-is-systems-manager.html" rel="noopener noreferrer"&gt;AWS Systems Manager&lt;/a&gt; capabilities.&lt;/p&gt;

&lt;p&gt;Look into how you can automate runbooks and trigger maintenance tasks with &lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-automation.html" rel="noopener noreferrer"&gt;AWS Systems Manager Automation documents&lt;/a&gt;.&lt;/p&gt;
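
&lt;p&gt;As one example of automating maintenance instead of logging in, patching can be scheduled entirely through Systems Manager. A rough Terraform sketch, with hypothetical names, schedule and tag targeting:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_ssm_maintenance_window" "patching" {
  name     = "weekly-patching"
  schedule = "cron(0 3 ? * SUN *)" # Sundays at 03:00
  duration = 2
  cutoff   = 1
}

resource "aws_ssm_maintenance_window_target" "all_managed" {
  window_id     = aws_ssm_maintenance_window.patching.id
  resource_type = "INSTANCE"

  targets {
    key    = "tag:Environment"
    values = ["dev"] # hypothetical tag selection
  }
}

resource "aws_ssm_maintenance_window_task" "patch" {
  window_id       = aws_ssm_maintenance_window.patching.id
  task_type       = "RUN_COMMAND"
  task_arn        = "AWS-RunPatchBaseline"
  max_concurrency = "2"
  max_errors      = "1"

  targets {
    key    = "WindowTargetIds"
    values = [aws_ssm_maintenance_window_target.all_managed.id]
  }

  task_invocation_parameters {
    run_command_parameters {
      parameter {
        name   = "Operation"
        values = ["Install"]
      }
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;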

&lt;p&gt;Don’t perform changes in live systems. Deploy EC2 compute resources using the &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/rel_tracking_change_management_immutable_infrastructure.html" rel="noopener noreferrer"&gt;immutable infrastructure pattern&lt;/a&gt;, just as you would for container based workloads. Or, even better, containerize and move to AWS ECS Fargate/EKS.&lt;/p&gt;

&lt;p&gt;If your use-case dictates interactive access; disable security group ingress rules for port 22/tcp (SSH) or port 3389/tcp (RDP) and leverage AWS SSM – &lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager.html" rel="noopener noreferrer"&gt;Session Manager&lt;/a&gt; agent based access to EC2. You can also configure activity logging to Amazon CloudWatch Logs for a full audit trail. We will explore this workflow in the coming chapters.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deploy sample infrastructure
&lt;/h3&gt;

&lt;p&gt;For testing the described alternatives I have developed a sample Terraform module which deploys the following resources.&lt;/p&gt;

&lt;h5&gt;
  
  
  Well-Architected Virtual Private Cloud (VPC)
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;Public subnets&lt;/li&gt;
&lt;li&gt;Private subnets with VPC endpoints&lt;/li&gt;
&lt;li&gt;Database subnets&lt;/li&gt;
&lt;li&gt;NAT Gateways&lt;/li&gt;
&lt;li&gt;VPC endpoints&lt;/li&gt;
&lt;/ul&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;locals {
  azs = slice(data.aws_availability_zones.available.names, 0, 2)
  aws_account_id = data.aws_caller_identity.current.account_id
}

module "vpc" {
  source = "git::https://github.com/terraform-aws-modules/terraform-aws-vpc.git?ref=25322b6b6be69db6cca7f167d7b0e5327156a595"

  name = var.name_prefix
  cidr = var.vpc_cidr
  azs = local.azs
  private_subnets = [for k, v in local.azs : cidrsubnet(var.vpc_cidr, 8, k)]
  public_subnets = [for k, v in local.azs : cidrsubnet(var.vpc_cidr, 8, k + 4)]
  database_subnets = [for k, v in local.azs : cidrsubnet(var.vpc_cidr, 8, k + 8)]

  create_database_subnet_group = true
  create_database_subnet_route_table = true
  create_database_internet_gateway_route = false

  manage_default_network_acl = true
  manage_default_route_table = true
  manage_default_security_group = true

  enable_dns_hostnames = true
  enable_dns_support = true
  enable_nat_gateway = true
  single_nat_gateway = false
  one_nat_gateway_per_az = true

  enable_flow_log = true
  create_flow_log_cloudwatch_log_group = true
  create_flow_log_cloudwatch_iam_role = true
  flow_log_max_aggregation_interval = 60

  vpc_tags = {
    Name = var.name_prefix
  }
}

module "vpc_endpoints" {
  source = "git::https://github.com/terraform-aws-modules/terraform-aws-vpc.git//modules/vpc-endpoints?ref=4a2809c673afa13097af98c2e3c553da8db766a9"

  vpc_id = module.vpc.vpc_id

  create_security_group = true
  security_group_name_prefix = "${var.name_prefix}-vpc-endpoints-"
  security_group_description = "VPC endpoint security group"
  security_group_rules = {
    ingress_https = {
      description = "HTTPS from VPC"
      cidr_blocks = [module.vpc.vpc_cidr_block]
    }
  }

  endpoints = {
    dynamodb = {
      service = "dynamodb"
      service_type = "Gateway"
      route_table_ids = flatten([module.vpc.intra_route_table_ids, module.vpc.private_route_table_ids, module.vpc.public_route_table_ids])
      policy = data.aws_iam_policy_document.dynamodb_endpoint_policy.json
      tags = { Name = "dynamodb-vpc-endpoint" }
    },
    ecs = {
      service = "ecs"
      private_dns_enabled = true
      subnet_ids = module.vpc.private_subnets
    },
    ecs_telemetry = {
      create = false
      service = "ecs-telemetry"
      private_dns_enabled = true
      subnet_ids = module.vpc.private_subnets
    },
    ecr_api = {
      service = "ecr.api"
      private_dns_enabled = true
      subnet_ids = module.vpc.private_subnets
      policy = data.aws_iam_policy_document.generic_endpoint_policy.json
    },
    ecr_dkr = {
      service = "ecr.dkr"
      private_dns_enabled = true
      subnet_ids = module.vpc.private_subnets
      policy = data.aws_iam_policy_document.generic_endpoint_policy.json
    },
    rds = {
      service = "rds"
      private_dns_enabled = true
      subnet_ids = module.vpc.private_subnets
      security_group_ids = [module.db.security_group_id]
    },
    kms = {
      service = "kms"
      private_dns_enabled = true
      subnet_ids = module.vpc.database_subnets
    },
    ssm = {
      service = "ssm"
      private_dns_enabled = true
      subnet_ids = module.vpc.private_subnets
    },
    ssmmessages = {
      service = "ssmmessages"
      private_dns_enabled = true
      subnet_ids = module.vpc.private_subnets
    },
    ec2messages = {
      service = "ec2messages"
      private_dns_enabled = true
      subnet_ids = module.vpc.private_subnets
    }
  }

  tags = {
    Name = var.name_prefix
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h5&gt;
  
  
  RDS Aurora MySQL Multi-AZ cluster
&lt;/h5&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;module "db" {
  source = "git::https://github.com/terraform-aws-modules/terraform-aws-rds-aurora.git?ref=7d46e900b31322fd7a0ab0d7f67006ba4836c995"

  name = "${var.name_prefix}-rds"
  engine = "aurora-mysql"
  engine_version = "8.0"
  master_username = "root"
  instances = {
    1 = {
      instance_class = "db.t3.medium"
    }
    2 = {
      instance_class = "db.t3.medium"
    }
  }
  vpc_id = module.vpc.vpc_id
  db_subnet_group_name = module.vpc.database_subnet_group_name
  security_group_rules = {
    # Map keys must be unique; using "ingress" twice would silently
    # drop one of the rules.
    ingress_private_access = {
      source_security_group_id = aws_security_group.private_access.id
    }
    ingress_cloud9 = {
      source_security_group_id = data.aws_security_group.cloud9_security_group.id
    }
    kms_vpc_endpoint = {
      type = "egress"
      from_port = 443
      to_port = 443
      source_security_group_id = module.vpc_endpoints.security_group_id
    }
  }

  tags = {
    Name = var.name_prefix
    Environment = "dev"
    Classification = "internal"
  }

  manage_master_user_password_rotation = true
  master_user_password_rotation_schedule_expression = "rate(7 days)"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h5&gt;
  
  
  EC2 Amazon Linux instance and security group
&lt;/h5&gt;

&lt;p&gt;The Security Group for RDS is provisioned within the module. One placeholder security group labeled “private_access” is defined for EC2 and CloudShell purposes, which only permits egress traffic. It is referenced in the RDS cluster security group to permit incoming connections on port 3306 for MySQL. This is called &lt;a href="https://docs.aws.amazon.com/vpc/latest/userguide/security-group-rules.html#security-group-referencing" rel="noopener noreferrer"&gt;security group referencing&lt;/a&gt; and allows for dynamic configurations instead of specifying static CIDR ranges, which are often too permissive.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fky7530bxir8v5a2f56y8.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fky7530bxir8v5a2f56y8.jpg" width="800" height="676"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;data "aws_ami" "amazon_linux_23" {
  most_recent = true
  owners = ["amazon"]

  filter {
    name = "name"
    values = ["al2023-ami-2023*-x86_64"]
  }
}

module "ec2_instance" {
  source = "git::https://github.com/terraform-aws-modules/terraform-aws-ec2-instance.git?ref=4f8387d0925510a83ee3cb88c541beb77ce4bad6"

  name = "${var.name_prefix}-ec2"
  ami = data.aws_ami.amazon_linux_23.id
  create_iam_instance_profile = true
  iam_role_description = "IAM role for EC2 instance and SSM access"
  iam_role_policies = {
    AmazonSSMManagedInstanceCore = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
  }
  instance_type = "t2.micro"
  vpc_security_group_ids = [aws_security_group.private_access.id]
  subnet_id = element(module.vpc.private_subnets, 0)

  # Enforces IMDSv2
  metadata_options = {
    "http_endpoint" : "enabled",
    "http_put_response_hop_limit" : 1,
    "http_tokens" : "required"
  }

  tags = {
    Name = "${var.name_prefix}-ec2"
    Environment = "dev"
  }
}

resource "aws_security_group" "private_access" {
  #checkov:skip=CKV2_AWS_5: Placeholder security group, to be assigned to applicable resources, but beyond scope of this module.
  name_prefix = "${var.name_prefix}-private-access"
  description = "Security group for private access from local resources. Permits egress traffic."
  vpc_id = module.vpc.vpc_id

  egress {
    description = "Permit egress TCP"
    from_port = 0
    to_port = 65535
    protocol = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    description = "Permit egress UDP"
    from_port = 0
    to_port = 65535
    protocol = "udp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    description = "Permit egress ICMP"
    from_port = -1
    to_port = -1
    protocol = "icmp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "${var.name_prefix}-sg-private-access"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h5&gt;
  
  
  AWS Cloud9 SSM Managed Instance
&lt;/h5&gt;

&lt;p&gt;The data source for the security group makes it possible to identify the security group provisioned by AWS Cloud9. This is added in an ingress rule for the RDS cluster, as previously defined.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_cloud9_environment_ec2" "cloud9_ssm_instance" {
  name = "${var.name_prefix}-cloud9"
  instance_type = "t2.micro"
  automatic_stop_time_minutes = 30
  image_id = "amazonlinux-2023-x86_64"
  connection_type = "CONNECT_SSM"
  subnet_id = element(module.vpc.private_subnets, 0)
  owner_arn = length(var.cloud9_instance_owner_arn) &amp;gt; 0 ? var.cloud9_instance_owner_arn : null
}

data "aws_security_group" "cloud9_security_group" {
  filter {
    name = "tag:aws:cloud9:environment"
    values = [
      aws_cloud9_environment_ec2.cloud9_ssm_instance.id
    ]
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a fully working code repository along with setup instructions, see &lt;a href="https://github.com/haakond/terraform-aws-bastion-host-alternatives/blob/main/examples/main.tf" rel="noopener noreferrer"&gt;https://github.com/haakond/terraform-aws-bastion-host-alternatives/blob/main/examples/main.tf&lt;/a&gt; and &lt;a href="https://github.com/haakond/terraform-aws-bastion-host-alternatives/blob/main/README.md" rel="noopener noreferrer"&gt;https://github.com/haakond/terraform-aws-bastion-host-alternatives/blob/main/README.md&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Alternative 1: AWS Systems Manager – Session Manager
&lt;/h3&gt;

&lt;p&gt;With &lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager.html" rel="noopener noreferrer"&gt;AWS Systems Manager – Session Manager&lt;/a&gt;, you can manage your Amazon Elastic Compute Cloud (Amazon EC2) instances, edge devices, on-premises servers, and virtual machines (VMs). Port forwarding is also supported to connect to remote hosts in private subnets.&lt;/p&gt;

&lt;p&gt;You can use either an interactive one-click browser-based shell or the AWS Command Line Interface (AWS CLI). Session Manager provides secure and auditable node management without the need to open inbound ports, maintain bastion hosts, or manage SSH keys. Session Manager supports Linux, Windows and macOS and session activity can be logged with AWS CloudTrail and Amazon CloudWatch Logs.&lt;/p&gt;

&lt;p&gt;Session Manager can be configured to &lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-logging.html" rel="noopener noreferrer"&gt;log entered commands and their output during a session&lt;/a&gt;, which can be used for generating reports or audit situations.&lt;/p&gt;
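
&lt;p&gt;Session logging is controlled through the Session Manager preferences document. A rough Terraform sketch that sends session activity to a CloudWatch Logs group follows; the log group name is a hypothetical placeholder:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_cloudwatch_log_group" "ssm_sessions" {
  name              = "/ssm/session-logs" # hypothetical name
  retention_in_days = 90
}

# Session Manager reads its preferences from the regional document
# named "SSM-SessionManagerRunShell".
resource "aws_ssm_document" "session_preferences" {
  name            = "SSM-SessionManagerRunShell"
  document_type   = "Session"
  document_format = "JSON"

  content = jsonencode({
    schemaVersion = "1.0"
    description   = "Session Manager preferences with CloudWatch logging"
    sessionType   = "Standard_Stream"
    inputs = {
      cloudWatchLogGroupName      = aws_cloudwatch_log_group.ssm_sessions.name
      cloudWatchEncryptionEnabled = false
    }
  })
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;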

&lt;p&gt;Note: AWS also provides an option with &lt;a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/connect-with-ec2-instance-connect-endpoint.html" rel="noopener noreferrer"&gt;EC2 Instance Connect Endpoints&lt;/a&gt;, but this is based on SSH.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F22dlc3hpqa661hscdoau.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F22dlc3hpqa661hscdoau.jpg" width="800" height="514"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The coming two examples are based on this access pattern.&lt;/p&gt;

&lt;h4&gt;
  
  
  Prerequisites
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/operating-systems-and-machine-types.html" rel="noopener noreferrer"&gt;Supported operating system&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-agent.html" rel="noopener noreferrer"&gt;AWS Systems Manager SSM agent installed&lt;/a&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/ami-preinstalled-agent.html" rel="noopener noreferrer"&gt;List of AMIs with the SSM Agent preinstalled&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-getting-started-privatelink.html" rel="noopener noreferrer"&gt;Connectivity to endpoints&lt;/a&gt; ec2messages, ssm and ssmmessages in the current region&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-getting-started-instance-profile.html" rel="noopener noreferrer"&gt;IAM service role permissions&lt;/a&gt; AmazonSSMManagedInstanceCore or equivalent&lt;/li&gt;

&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-working-with-install-plugin.html" rel="noopener noreferrer"&gt;Optional: Install the Session Manager plugin for the AWS CLI&lt;/a&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;h4&gt;
  
  
  Connecting to an EC2 instance in a private subnet from the AWS Console
&lt;/h4&gt;

&lt;p&gt;There are two possible starting points in the AWS Console: AWS Systems Manager – Session Manager, or the EC2 instances list directly. The EC2 approach is usually the fastest and most convenient. With the prerequisites in order, find the instance you want to connect to and hit Connect.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyyu2o8q22e64l0azsg2m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyyu2o8q22e64l0azsg2m.png" width="800" height="123"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Ensure the tab with the option Session Manager is chosen and click Connect again.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ivnp0tdtijw6f90jrwn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ivnp0tdtijw6f90jrwn.png" width="800" height="386"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I am now logged in and authenticated with my Federated AWS IAM Identity Center method and we have full traceability.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu7ly1rymvoarl4pulr31.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu7ly1rymvoarl4pulr31.png" width="800" height="306"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Connecting to an EC2 instance in a private subnet from your local workstation with the AWS CLI and AWS IAM Identity Center
&lt;/h4&gt;

&lt;p&gt;In this example Microsoft Entra ID is the identity provider and &lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers.html" rel="noopener noreferrer"&gt;federation&lt;/a&gt; is configured with AWS IAM Identity Center for modern user management. Do yourself a favor and get rid of those IAM users with static access keys.&lt;/p&gt;

&lt;p&gt;To authorize your workstation based on Linux, macOS or Windows Subsystem for Linux, ensure you have the &lt;a href="https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html" rel="noopener noreferrer"&gt;latest version of the AWS CLI installed&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Run &lt;code&gt;aws configure sso&lt;/code&gt; and follow the instructions to obtain a valid session based on a browser where you’re currently logged in. For a full step by step guide see &lt;a href="https://docs.aws.amazon.com/cli/latest/userguide/sso-configure-profile-token.html#sso-configure-profile-token-auto-sso" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/cli/latest/userguide/sso-configure-profile-token.html#sso-configure-profile-token-auto-sso&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You should now be logged in and have chosen the relevant AWS account and role to assume.&lt;/p&gt;

&lt;p&gt;To confirm, I run the following AWS CLI command to list EC2 instances including the Name tag value. One instance is returned.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws ec2 describe-instances --query 'Reservations[].Instances[].[InstanceId, Tags[?Key==Name].Value[]]' --output=json

[
    [
        [
            "i-0b448a908727eec7d",
            [
                "bastion-alternative-demo-ec2"
            ]
        ]
    ]
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Connecting to the instance is as simple as:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws ssm start-session --target i-0b448a908727eec7d
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3qh9tjngmspttmb5j3hb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3qh9tjngmspttmb5j3hb.png" width="800" height="323"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Type &lt;code&gt;exit&lt;/code&gt; to terminate the session. There you go: CLI or Console based access with a security group with no inbound rules.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1jjvuxw1whuw6oe9ontc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1jjvuxw1whuw6oe9ontc.png" width="800" height="275"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Connecting to an RDS cluster in a private database subnet with local port forwarding
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7cfrgbapmt7vu9631alw.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7cfrgbapmt7vu9631alw.jpg" width="800" height="337"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As with the previous example, establish a session with &lt;code&gt;aws configure sso&lt;/code&gt; and identify the EC2 instance you would like to use as a proxy. In this example we use the same one. To find the RDS cluster writer endpoint name, you can issue the following command:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws rds describe-db-clusters \
  --query 'DBClusters[?starts_with(DBClusterIdentifier, `bastion-alternative-demo`)].DBClusterIdentifier' \
  --output text | xargs -I {} aws rds describe-db-cluster-endpoints \
  --db-cluster-identifier {} \
  --query 'DBClusterEndpoints[?EndpointType==`WRITER`].Endpoint' \
  --output text
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alternatively, in the AWS Console, navigate to RDS and copy the writer endpoint name (an FQDN).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgn9bpqxlqhju2214l2n9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgn9bpqxlqhju2214l2n9.png" width="800" height="464"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Execute the following command to start a port forwarding session which should provide MySQL connectivity to port 3306 on your local machine:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws ssm start-session --target i-0b448a908727eec7d --document-name AWS-StartPortForwardingSessionToRemoteHost --parameters '{"portNumber":["3306"],"localPortNumber":["3306"],"host":["bastion-alternative-demo-rds.cluster-cmikipz5ncly.eu-west-1.rds.amazonaws.com"]}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The main difference is that this time the &lt;code&gt;aws ssm start-session&lt;/code&gt; command triggers an AWS Systems Manager document called “AWS-StartPortForwardingSessionToRemoteHost”, to which we supply the desired parameters. For demonstration purposes, the Terraform module has provisioned an RDS Aurora MySQL cluster in private database subnets.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffvxvajxr31iqm7hf2x7s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffvxvajxr31iqm7hf2x7s.png" width="800" height="117"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The command outputs “Starting session with SessionId (..) Port 3306 opened, Waiting for connections.”&lt;/p&gt;

&lt;p&gt;In another terminal I run &lt;code&gt;mysql -u root -p -h 127.0.0.1 --port 3306&lt;/code&gt; and take the opportunity to create a new privileged MySQL user called chuck_norris.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1yl5a58o4b8t66er8jen.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1yl5a58o4b8t66er8jen.png" width="800" height="486"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5q06dtvdnp8bd7htvio5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5q06dtvdnp8bd7htvio5.png" width="558" height="461"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Verified to be working as expected.&lt;/p&gt;

&lt;p&gt;If you prefer to use a GUI client like &lt;a href="https://www.mysql.com/products/workbench/" rel="noopener noreferrer"&gt;MySQL Workbench&lt;/a&gt; or &lt;a href="https://www.heidisql.com/" rel="noopener noreferrer"&gt;HeidiSQL&lt;/a&gt;, configure it to connect to localhost on port 3306. If you run a development database server locally, you will probably want to forward to a different local port.&lt;/p&gt;
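
&lt;p&gt;For example, to forward to local port 13306 instead, reusing the instance ID and writer endpoint from the example above, the command could look like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws ssm start-session --target i-0b448a908727eec7d --document-name AWS-StartPortForwardingSessionToRemoteHost --parameters '{"portNumber":["3306"],"localPortNumber":["13306"],"host":["bastion-alternative-demo-rds.cluster-cmikipz5ncly.eu-west-1.rds.amazonaws.com"]}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Your GUI client would then connect to 127.0.0.1 on port 13306, leaving your local database server’s port 3306 untouched.&lt;/p&gt;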

&lt;h3&gt;
  
  
  Alternative 2: AWS CloudShell VPC Environment
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymj9ncitz3k8ebdarlij.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymj9ncitz3k8ebdarlij.jpg" width="800" height="514"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AWS CloudShell is a browser-based shell that is pre-authenticated with your console credentials which makes it easy to securely manage, explore and interact with AWS resources. Common development tools related to AWS are also pre-installed.&lt;/p&gt;

&lt;p&gt;AWS &lt;a href="https://aws.amazon.com/about-aws/whats-new/2024/06/aws-cloudshell-amazon-virtual-private-cloud/" rel="noopener noreferrer"&gt;announced&lt;/a&gt; VPC environment support for AWS CloudShell on June 26th 2024. This makes it possible to use CloudShell securely within the same subnet as other resources in your VPCs, without additional network configuration. Before this, there was no way to control the network flow.&lt;/p&gt;

&lt;p&gt;One caveat is that the AWS CloudShell VPC environment does not support persistent storage, unlike the regular CloudShell feature. Storage is ephemeral: data and home directories are deleted when an active environment ends, so ensure any data you care about is saved in Amazon S3 or another relevant persistent store. In my opinion, from a security perspective, auto-cleanup is a positive thing.&lt;/p&gt;

&lt;p&gt;Open CloudShell from the main AWS Services search box or the logo icon:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0o5mpg2owuszxca8skfj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0o5mpg2owuszxca8skfj.png" width="800" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We choose the pre-provisioned VPC, subnet and security group:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy65k278dik1rhwn4tevz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy65k278dik1rhwn4tevz.png" width="522" height="793"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let’s verify outbound connectivity:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6t0uj27vkwpgmvgrj8w7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6t0uj27vkwpgmvgrj8w7.png" width="800" height="312"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx17epgmy5j44lkgdcpio.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx17epgmy5j44lkgdcpio.png" width="460" height="127"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The /home partition has about 12GB free space. If you need more scratch space, look into mounting &lt;a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AmazonEFS.html" rel="noopener noreferrer"&gt;Amazon Elastic File System&lt;/a&gt; or dump stuff on &lt;a href="https://aws.amazon.com/s3/" rel="noopener noreferrer"&gt;Amazon S3&lt;/a&gt;. Keep in mind that the CloudShell volume is ephemeral and data will be gone after your session ends.&lt;/p&gt;
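
&lt;p&gt;Since the volume is ephemeral, a quick way to persist your work before the session ends is to sync it to S3. The bucket name below is a placeholder:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws s3 sync ~/work s3://my-example-bucket/cloudshell-backup/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;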

&lt;p&gt;Let’s also check that CloudShell’s Elastic Network Interface is in the expected private subnet. Thank you &lt;a href="https://docs.aws.amazon.com/amazonq/latest/qdeveloper-ug/q-in-IDE-setup.html" rel="noopener noreferrer"&gt;Amazon Q&lt;/a&gt; for the creative query suggestion.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws ec2 describe-network-interfaces \
  --filters "Name=private-ip-address,Values=$(ip addr show ens6 | grep -oE '((1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])\.){3}(1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])' | head -n 1)" \
  --query 'NetworkInterfaces[*].{SubnetId:SubnetId}' \
  --output text | xargs -I {} aws ec2 describe-subnets --subnet-ids {} \
  --query 'Subnets[*].{SubnetId:SubnetId,Ipv4CidrBlock:CidrBlock,TagName:Tags[?Key==`Name`].Value|[0]}' --output table
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjengc0gzcybhfhd2f89f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjengc0gzcybhfhd2f89f.png" width="800" height="125"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;CloudShell has many development tools pre-installed, including a MySQL client. To verify that the endpoint hostname resolves to a private IP address in the private subnet, install dig from bind-utils:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo yum install bind-utils -y
dig +short &amp;lt;hostname&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffehpblnv7dnsewtio4em.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffehpblnv7dnsewtio4em.png" width="800" height="41"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fheviw11gc6akc4pke5n5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fheviw11gc6akc4pke5n5.png" width="800" height="328"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We successfully connected to the RDS Aurora MySQL cluster and created a new database.&lt;/p&gt;

&lt;h3&gt;
  
  
  Alternative 3: AWS Cloud9 IDE
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8pyhkb1cr5xz0jxyvhx6.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8pyhkb1cr5xz0jxyvhx6.jpg" width="800" height="514"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/cloud9/latest/user-guide/welcome.html" rel="noopener noreferrer"&gt;AWS Cloud9&lt;/a&gt; is a cloud based integrated development environment (IDE) that can be accessed through the AWS Console. It could be an alternative if you prefer a lightweight client/workstation environment, where your development environment will be the configured the same across client devices.&lt;/p&gt;

&lt;p&gt;Cloud9 is a fully managed service based on EC2, with EBS for persistent data storage. Instance hibernation takes place after a configurable period of inactivity (starting at 30 minutes) to save costs when not in use. A Cloud9 &lt;a href="https://docs.aws.amazon.com/cloud9/latest/user-guide/environments.html" rel="noopener noreferrer"&gt;environment&lt;/a&gt; can be deployed into both public and private subnets, in either &lt;a href="https://docs.aws.amazon.com/cloud9/latest/user-guide/create-environment-main.html" rel="noopener noreferrer"&gt;EC2&lt;/a&gt; (recommended) or &lt;a href="https://docs.aws.amazon.com/cloud9/latest/user-guide/create-environment-ssh.html" rel="noopener noreferrer"&gt;SSH&lt;/a&gt; (discouraged) mode. The EC2 mode supports the “&lt;a href="https://docs.aws.amazon.com/cloud9/latest/user-guide/ec2-ssm.html" rel="noopener noreferrer"&gt;no-ingress instance&lt;/a&gt;” pattern, which requires no open inbound ports, by leveraging AWS Systems Manager. This is the procedure we will explore further.&lt;/p&gt;

&lt;p&gt;A sample Cloud9 instance has been provisioned as part of the Terraform sample module. Navigate to AWS Cloud9 in the AWS Console, locate “bastion-alternative-demo-cloud9” and click Open.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffy0ql6xp81darx1djjmd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffy0ql6xp81darx1djjmd.png" width="800" height="181"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The EC2 environment type comes with &lt;a href="https://docs.aws.amazon.com/cloud9/latest/user-guide/security-iam.html#auth-and-access-control-temporary-managed-credentials" rel="noopener noreferrer"&gt;AWS managed temporary credentials&lt;/a&gt; activated by default, which manages AWS access credentials on the user’s behalf, with certain restrictions. To ensure you get all privileges granted by the IAM policies of your active role session, disable this in the AWS Cloud9 main window under Preferences, AWS Settings, Credentials.&lt;/p&gt;

&lt;p&gt;Open a Terminal tab and issue &lt;code&gt;aws configure sso&lt;/code&gt; as in the previous example, and set the SSO session name “default”.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9pwm2mj9xdhh9vwpdarr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9pwm2mj9xdhh9vwpdarr.png" width="800" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;aws s3 ls&lt;/code&gt; verifies the AWS session. Technically, this could have been executed from “anywhere”, but the main benefits are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The Cloud9 environment maintains the configuration regardless of client device. &lt;/li&gt;
&lt;li&gt;Other resources on private subnets may be configured to be available.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Verify database connectivity:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo0ty73b2hbhf2eaxmqgm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo0ty73b2hbhf2eaxmqgm.png" width="800" height="281"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion and feature comparison
&lt;/h2&gt;

&lt;p&gt;In this blog post we explored alternative workflows that can replace the concept of a traditional bastion host accessed over SSH or RDP. AWS provides several alternatives, so you can choose the one that best fits your use case.&lt;/p&gt;

&lt;p&gt;| &lt;strong&gt;Alternative&lt;/strong&gt; | &lt;strong&gt;Good&lt;/strong&gt; | &lt;strong&gt;Not so good&lt;/strong&gt; | &lt;strong&gt;Pricing&lt;/strong&gt; |&lt;br&gt;
| &lt;strong&gt;AWS Systems Manager – Session Manager&lt;/strong&gt; | Provides easy access to any compute resource from the AWS Console or the AWS CLI with federation/AWS IAM Identity Center support.  &lt;/p&gt;

&lt;p&gt;Supports port forwarding and local GUI clients.  &lt;/p&gt;

&lt;p&gt;Full integration with Cloudtrail and CloudWatch Logs for auditing and activity tracking | All prerequisites for SSM Managed Instances may be seen as a hurdle, but can be solved with IaC. | No additional charges for accessing Amazon EC2 instances.  &lt;/p&gt;

&lt;p&gt;The advanced on-premises instance tier is required for using Session Manager to interactively access on-premises instances.  |&lt;br&gt;
| &lt;strong&gt;AWS CloudShell VPC Environments&lt;/strong&gt; | Available anywhere in the AWS Console.  &lt;/p&gt;

&lt;p&gt;Does not require extensive configuration.  &lt;/p&gt;

&lt;p&gt;Easy to use for quick commands or lookups.  &lt;/p&gt;

&lt;p&gt;Ephemeral storage, auto-cleanup after session inactivity.&lt;br&gt;&lt;br&gt;
 | Ephemeral storage, auto-cleanup after session inactivity.  &lt;/p&gt;

&lt;p&gt;Due to possible session timeout issues consider other options for long running commands, database imports/exports etc.  &lt;/p&gt;

&lt;p&gt;Audit and activity logging capabilities not matching SSM Session Manager. | No additional charges, minimum fees nor commitments. You only pay for other AWS resources you use with CloudShell to create and run your applications. |&lt;br&gt;
| &lt;strong&gt;AWS Cloud9&lt;/strong&gt; | Appealing if you’re also working on code development (application/IaC).&lt;br&gt;&lt;br&gt;
Same IDE experience regardless of client device.  &lt;/p&gt;

&lt;p&gt;Terraform/Cloudformation deployments can be triggered triggered from the same terminal.  &lt;/p&gt;

&lt;p&gt;Data is persisted on EBS until environment termination. | Session/permission management can be cumbersome.  &lt;/p&gt;

&lt;p&gt;Audit and activity logging capabilities not matching SSM Session Manager. | No additional charges, minimum fees nor commitments for the service itself. You pay only for the compute (EC2) and storage resources (EBS) for the environment.   &lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;&lt;br&gt;
t2.micro Linux instance at $0.0116/hour x 90 total hours used per month = $1.05&lt;br&gt;&lt;br&gt;
$0.10 per GB-month of provisioned storage x 10-GB storage volume = $1.00  &lt;/p&gt;

&lt;p&gt;Total monthly fees: $2.05 |&lt;/p&gt;

&lt;p&gt;My personal preference is to pivot to immutability with containers and automation, but AWS Systems Manager – Session Manager would be the most viable alternative workflow for any remaining EC2 based workloads.&lt;/p&gt;

&lt;p&gt;Instead of querying the database directly for troubleshooting or support requests, &lt;a href="https://docs.aws.amazon.com/wellarchitected/2023-10-03/framework/sec_protect_data_rest_use_people_away.html" rel="noopener noreferrer"&gt;develop a support microservice or dashboard for the purpose&lt;/a&gt;. It could be as simple as a SELECT * from the relevant tables that returns the data in JSON format, with standard employee authentication and authorization mechanisms built in.&lt;/p&gt;

&lt;p&gt;Full Terraform sample code is available at &lt;a href="https://github.com/haakond/terraform-aws-bastion-host-alternatives" rel="noopener noreferrer"&gt;https://github.com/haakond/terraform-aws-bastion-host-alternatives&lt;/a&gt;, feel free to grab anything you need.&lt;/p&gt;
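
&lt;p&gt;As an illustration of that idea, here is a minimal, hypothetical sketch of such a read-only support endpoint in Python. It uses SQLite purely as a stand-in for the Aurora MySQL cluster; a real implementation would use a MySQL driver against the writer endpoint, with your standard employee authentication and authorization in front:&lt;/p&gt;

```python
import json
import sqlite3

# Hypothetical allow-list of tables support staff may read.
ALLOWED_TABLES = {"orders"}

def support_snapshot(conn, table):
    """Return all rows of an allow-listed table as a list of dicts."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"table not allowed: {table}")
    conn.row_factory = sqlite3.Row
    rows = conn.execute(f"SELECT * FROM {table}").fetchall()
    return [dict(row) for row in rows]

def handler(event, context=None):
    """Lambda-style entry point returning the relevant data as JSON."""
    # In-memory stand-in database; a real service would connect to RDS.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [(1, "shipped"), (2, "pending")])
    body = {"orders": support_snapshot(conn, event.get("table", "orders"))}
    return {"statusCode": 200, "body": json.dumps(body)}
```

&lt;p&gt;Support staff would call this behind the company’s standard SSO instead of ever opening a database session.&lt;/p&gt;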

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/2023-10-03/framework/sec_protect_data_rest_use_people_away.html" rel="noopener noreferrer"&gt;SEC08-BP05 Use mechanisms to keep people away from data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager.html" rel="noopener noreferrer"&gt;AWS Systems Manager Session Manager&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/patch-manager.html" rel="noopener noreferrer"&gt;AWS Systems Manager Patch Manager&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/run-command.html" rel="noopener noreferrer"&gt;AWS Systems Manager Run Command&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-working-with.html" rel="noopener noreferrer"&gt;Working with Session Manager&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/cloudshell/latest/userguide/using-cshell-in-vpc.html" rel="noopener noreferrer"&gt;AWS CloudShell – Using AWS CloudShell in Amazon VPC&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/cloud9/latest/user-guide/welcome.html" rel="noopener noreferrer"&gt;AWS Cloud9 User Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://martinfowler.com/bliki/ImmutableServer.html" rel="noopener noreferrer"&gt;Martin Fowler – Immutable server&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The post &lt;a href="https://hedrange.com/2024/07/03/bye-bye-bastion/" rel="noopener noreferrer"&gt;Bye bye Bastion!&lt;/a&gt; first appeared on &lt;a href="https://hedrange.com" rel="noopener noreferrer"&gt;Håkon Eriksen Drange&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>articles</category>
      <category>bastion</category>
      <category>security</category>
      <category>aws</category>
    </item>
    <item>
      <title>Never miss an alert with AWS Chatbot and AWS SSM – Incident Manager</title>
      <dc:creator>Håkon Eriksen Drange</dc:creator>
      <pubDate>Fri, 31 May 2024 07:40:23 +0000</pubDate>
      <link>https://forem.com/haakoned/never-miss-an-alert-with-aws-chatbot-and-aws-ssm-incident-manager-46da</link>
      <guid>https://forem.com/haakoned/never-miss-an-alert-with-aws-chatbot-and-aws-ssm-incident-manager-46da</guid>
      <description>&lt;p&gt;&lt;strong&gt;Table of contents&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;
AWS Chatbot

&lt;ul&gt;
&lt;li&gt;AWS Chatbot Pricing&lt;/li&gt;
&lt;li&gt;Setting up AWS Chatbot for Slack&lt;/li&gt;
&lt;li&gt;Step 1: Add the AWS Chatbot app in Slack Automations to your desired Slack workspace&lt;/li&gt;
&lt;li&gt;
Step 2: Configure a Slack channel by inviting the AWS Chatbot app to your desired Slack channel

&lt;ul&gt;
&lt;li&gt;Manual procedure&lt;/li&gt;
&lt;li&gt;Automated deployment with Terraform&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Step 3: Test notifications&lt;/li&gt;

&lt;li&gt;Sending CloudWatch alarms to Slack&lt;/li&gt;

&lt;li&gt;Sending AWS Health notifications and Security Hub findings to Slack&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;

AWS Systems Manager – Incident Manager

&lt;ul&gt;
&lt;li&gt;Response plans and incident severity classification&lt;/li&gt;
&lt;li&gt;Notification deduplication&lt;/li&gt;
&lt;li&gt;Setting up AWS Systems Manager – Incident Manager&lt;/li&gt;
&lt;li&gt;Configure AWS Systems Manager for your applicable regions with region failover replication set&lt;/li&gt;
&lt;li&gt;Configure contacts for your on-call team&lt;/li&gt;
&lt;li&gt;Configure response plan – Example for type Critical Incident Process&lt;/li&gt;
&lt;li&gt;Configure on-call schedule&lt;/li&gt;
&lt;li&gt;Configure Systems Manager Runbook to be executed for the Critical Incident Process&lt;/li&gt;
&lt;li&gt;Deploying AWS Systems Manager – Incident Manager with Terraform&lt;/li&gt;
&lt;li&gt;Using AWS SSM Incident Manager&lt;/li&gt;
&lt;li&gt;Activate contact channels&lt;/li&gt;
&lt;li&gt;On-call schedule – calendar overview&lt;/li&gt;
&lt;li&gt;Fire drill&lt;/li&gt;
&lt;li&gt;Incident Manager Pricing&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Conclusion&lt;/li&gt;

&lt;li&gt;References&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Having a clear understanding of operational service level indicators like service latency and availability is paramount to ensure you can deliver the expected quality of service to your end users, customers and your company. By expected I mean exactly that: not less, not far above, but right on target.&lt;/p&gt;

&lt;p&gt;Lower quality of service can manifest as unhappy customers and end users. Higher quality of service can be seen as a good thing, but too high a level can lead to over-engineering, increased complexity, higher cost of over-provisioned capacity and not daring to innovate. Identifying the right level based on conversations with customer stakeholders, and aligning expectations through Service Level Agreements and &lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-ServiceLevelObjectives.html#CloudWatch-ServiceLevelObjectives-concepts" rel="noopener noreferrer"&gt;Service Level Objectives&lt;/a&gt;, can ensure all involved parties share the same understanding of where the bar is. If a stakeholder says “the website must be up and running at all times”, people can have different understandings of what this means in practice. Does the stakeholder mean 99.99%? Or even 100%? Or is 43 minutes of downtime over the course of 30 days acceptable? This can lead to very interesting conversations based on operational metrics data instead of subjective opinion.&lt;/p&gt;

&lt;p&gt;The first pillar in the AWS Well-Architected Framework is called Operational Excellence, and these are a few of the best practice recommendations that can get you very far, in addition to the Reliability Pillar.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/ops_observability_identify_kpis.html" rel="noopener noreferrer"&gt;OPS04-BP01 Identify key performance indicators&lt;/a&gt;

&lt;ul&gt;
&lt;li&gt;Service Level Indicators such as availability and/or latency&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/ops_observability_application_telemetry.html" rel="noopener noreferrer"&gt;OPS04-BP02 Implement application telemetry&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/ops_observability_customer_telemetry.html" rel="noopener noreferrer"&gt;OPS04-BP03 Implement user experience telemetry&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/ops_event_response_event_incident_problem_process.html" rel="noopener noreferrer"&gt;OPS10-BP01 Use a process for event, incident, and problem management&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/ops_event_response_prioritize_events.html" rel="noopener noreferrer"&gt;OPS10-BP03 Prioritize operational events based on business impact&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/ops_workload_observability_create_alerts.html" rel="noopener noreferrer"&gt;OPS08-BP04 Create actionable alerts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/ops_event_response_define_escalation_paths.html" rel="noopener noreferrer"&gt;OPS10-BP04 Define escalation paths&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/ops_event_response_dashboards.html" rel="noopener noreferrer"&gt;OPS10-BP06 Communicate status through dashboards&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/rel-13.html" rel="noopener noreferrer"&gt;REL 13. How do you plan for disaster recovery (DR)?&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Consider that you have deployed a set of workloads, defined &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/ops_observability_identify_kpis.html" rel="noopener noreferrer"&gt;KPIs&lt;/a&gt; for the five most important end-user workflows in the systems, configured CloudWatch alarms, and defined &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/ops_ready_to_support_use_playbooks.html" rel="noopener noreferrer"&gt;playbooks&lt;/a&gt; in an &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/ops_evolve_ops_knowledge_management.html" rel="noopener noreferrer"&gt;operational wiki&lt;/a&gt; describing how to handle the events.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What is your alerting strategy? Many start with CloudWatch Alarms =&amp;gt; Simple Notification Service (SNS) =&amp;gt; email/Slack, but what happens when multiple alarms trigger at once? How do you ensure the right personnel are notified so that critical alerts aren’t missed, and how do they know which one(s) to prioritize first?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In this blog post we will explore one possible solution, which leverages AWS Systems Manager – Incident Manager to efficiently manage situations where a workload has become unavailable or is severely impacted. We will also look into how AWS Chatbot can increase insight and visibility into operational metrics and events by getting data out of the AWS Console and into Slack, with examples for events from AWS Health, AWS Security Hub and CloudWatch alarms for a container-based workload. I will also demonstrate how you can provision the solution with Terraform.&lt;/p&gt;

&lt;h2&gt;
  
  
  AWS Chatbot
&lt;/h2&gt;

&lt;p&gt;This is a managed service from AWS which &lt;em&gt;enables ChatOps for AWS&lt;/em&gt;. Operational tasks and visibility can be shifted from the AWS Console to Amazon Chime, Microsoft Teams and Slack. You can receive notifications for operational alarms, security alarms, budget deviations &lt;a href="https://docs.aws.amazon.com/chatbot/latest/adminguide/related-services.html" rel="noopener noreferrer"&gt;and so on&lt;/a&gt;. The service eliminates the need for self-managed AWS Lambda functions for these types of integrations, and if your organization is using Slack or Microsoft Teams, I highly recommend checking whether you can replace any custom integration logic with AWS Chatbot.&lt;/p&gt;

&lt;p&gt;Another useful aspect of AWS Chatbot is the possibility to search and discover AWS information and ask service questions to Amazon Q, without needing to dig through official documentation sources or search the internet. The answers are visible to your team so that everyone is kept in the loop. You can also ask Q in VS Code if you have a topic you prefer to keep to yourself.&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS Chatbot Pricing
&lt;/h3&gt;

&lt;p&gt;AWS Chatbot is &lt;a href="https://aws.amazon.com/chatbot/pricing/" rel="noopener noreferrer"&gt;free to use&lt;/a&gt;. You only pay for the underlying services (SNS, CloudWatch, GuardDuty, EventBridge, Security Hub), just as you would when calling them via the CLI or Console. Slack/Microsoft Teams licensing applies as relevant.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting up AWS Chatbot for Slack
&lt;/h3&gt;

&lt;p&gt;AWS Chatbot uses Amazon Simple Notification Service (SNS) topics to send event and alarm notifications from AWS services to the chat channels you configure. The initial configuration of the service, by authorizing Slack, is only possible through the AWS Console, but the remainder of the solution can be provisioned with a combination of Terraform providers for AWS. I will take you through the relevant code snippets. Full sample code is available on my GitHub page and is further explained in the Conclusion section.&lt;/p&gt;

&lt;p&gt;Different organizations have different operational models, depending on whether they run centralized platform services or a more distributed DevOps style.&lt;/p&gt;

&lt;p&gt;If you prefer to centralize the configuration, you can do so in one AWS account and have other workload accounts publish events to the central SNS topic. However, after some experience with AWS Chatbot, and with recent resource support in the AWS Cloud Control provider for Terraform, I have landed on deploying AWS Chatbot in each production workload account: this can be automated to a high degree, and it reduces the complexity of resource access across accounts. One example is that AWS Chatbot can automatically include CloudWatch metric graphs and other useful information.&lt;/p&gt;

&lt;p&gt;Each DevOps team can then take full responsibility for their workloads by sending CloudWatch Alarms and EventBridge events from services like AWS Health, AWS Security Hub and Amazon GuardDuty to a team specific Slack/Microsoft Teams channel. A centralized platform team could do the same for aggregated insights for Landing Zone governance as a safety net.&lt;/p&gt;

&lt;p&gt;In this case, &lt;a href="https://docs.aws.amazon.com/chatbot/latest/adminguide/slack-setup.html" rel="noopener noreferrer"&gt;Slack&lt;/a&gt; will be demonstrated. After the Terraform resources have been provisioned, take note of the output sns_topic_for_aws_chatbot_arn; it will be referenced in later configuration.&lt;/p&gt;
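&lt;p&gt;As a minimal sketch, such an output declaration could look like the following (assuming the SNS topic resource name used in the module further down; adjust to your own naming):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;output "sns_topic_for_aws_chatbot_arn" {
  description = "ARN of the SNS topic configured for AWS Chatbot in the primary region"
  value       = aws_sns_topic.sns_topic_for_aws_chatbot_primary_region.arn
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;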

&lt;h4&gt;
  
  
  Step 1: Add the AWS Chatbot app in Slack Automations to your desired Slack workspace
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;As a Slack workspace administrator, add AWS Chatbot to the Slack workspace.&lt;/li&gt;
&lt;li&gt;Log in to the AWS Console in your respective workload account&lt;/li&gt;
&lt;li&gt;Go to AWS Chatbot console&lt;/li&gt;
&lt;li&gt;Configure a chat client&lt;/li&gt;
&lt;li&gt;Choose Slack, Configure&lt;/li&gt;
&lt;li&gt;Choose the Slack workspace you prefer to use&lt;/li&gt;
&lt;li&gt;Allow&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdxiw26d1keqxnm9ttply.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdxiw26d1keqxnm9ttply.png" width="662" height="129"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Official reference: &lt;a href="https://docs.aws.amazon.com/chatbot/latest/adminguide/slack-setup.html#slack-client-setup" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/chatbot/latest/adminguide/slack-setup.html#slack-client-setup&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 2: Configure a Slack channel by inviting the AWS Chatbot app to your desired Slack channel
&lt;/h4&gt;

&lt;h5&gt;
  
  
  Manual procedure
&lt;/h5&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F83oscx8gm9oqsanp774u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F83oscx8gm9oqsanp774u.png" width="447" height="184"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In my case I name the Configuration “hed-aws-monitoring” and configure event logging to Amazon CloudWatch Logs, both to verify that the setup is working as expected and to aid troubleshooting. AWS Chatbot creates this CloudWatch Logs log group in us-east-1 as part of the provisioning phase; it is &lt;em&gt;not&lt;/em&gt; possible (as of June 2024) to configure an existing CloudWatch Logs log group you may already have provisioned.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fifwj92z54alovglhwev0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fifwj92z54alovglhwev0.png" width="800" height="713"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For Role settings I choose to let AWS Chatbot generate the desired IAM configuration. You can also define this yourself. One approach is to start with the AWS Chatbot-generated resources and then replace them with your own Terraform resource definitions afterwards.&lt;/p&gt;

&lt;p&gt;I choose the Notification, Incident Manager, Resource Explorer and Amazon Q permissions. I do not expect to run read-only commands, invoke Lambda functions or call AWS Support commands directly from Slack, so I leave these unchecked.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftl93v37fvit2uc9yi6wb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftl93v37fvit2uc9yi6wb.png" width="800" height="811"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Slack channel is now configured.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft7djl0ywpunj2we1gr8r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft7djl0ywpunj2we1gr8r.png" width="800" height="375"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you want to add support for notifications from global services that produce CloudWatch metrics in us-east-1, such as Route 53 Health Checks, you can configure an additional SNS topic.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhs6br0bdx94ol36fvdyv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhs6br0bdx94ol36fvdyv.png" width="800" height="711"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h5&gt;
  
  
  Automated deployment with Terraform
&lt;/h5&gt;

&lt;p&gt;A fully working Terraform module is available at &lt;a href="https://github.com/haakond/terraform-aws-chatbot/blob/main/README.md" rel="noopener noreferrer"&gt;https://github.com/haakond/terraform-aws-chatbot/blob/main/README.md&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;AWS Chatbot channel configuration for Slack is only supported in the AWS Cloud Control Provider for Terraform, so first we start by declaring the necessary providers:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      version = "~&amp;gt; 5.53.0"
      configuration_aliases = [aws, aws.us-east-1]
    }
    awscc = {
      source = "hashicorp/awscc"
      version = "~&amp;gt; 1.2.0"
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first resource is the Slack channel configuration, with the relevant input variables for the Slack channel ID, the Slack workspace ID, and the SNS topics for the primary region and for us-east-1 (global service endpoints).&lt;/p&gt;

&lt;p&gt;We then declare the IAM role for this purpose, with the relevant managed policies to also be able to manage Incident Manager incidents and Security Hub findings directly from Slack. Feel free to adjust this to your use case. Lastly, the relevant SNS topics are configured in each region.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "awscc_chatbot_slack_channel_configuration" "chatbot_slack" {
  configuration_name = var.slack_channel_configuration_name
  iam_role_arn = awscc_iam_role.chatbot_channel_role.arn
  slack_channel_id = var.slack_channel_id
  slack_workspace_id = var.slack_workspace_id
  logging_level = var.logging_level
  sns_topic_arns = [aws_sns_topic.sns_topic_for_aws_chatbot_primary_region.arn, aws_sns_topic.sns_topic_for_aws_chatbot_us_east_1.arn]
  guardrail_policies = [
    "arn:aws:iam::aws:policy/PowerUserAccess"
  ]
}

resource "awscc_iam_role" "chatbot_channel_role" {
  role_name = "aws-chatbot-channel-role"
  assume_role_policy_document = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Sid = "Chatbot"
        Principal = {
          Service = "chatbot.amazonaws.com"
        }
      },
    ]
  })
  managed_policy_arns = [
    "arn:aws:iam::aws:policy/AWSResourceExplorerReadOnlyAccess",
    "arn:aws:iam::aws:policy/AWSIncidentManagerResolverAccess",
    "arn:aws:iam::aws:policy/AmazonQFullAccess",
    "arn:aws:iam::aws:policy/CloudWatchReadOnlyAccess",
    "arn:aws:iam::aws:policy/AWSSecurityHubFullAccess",
    "arn:aws:iam::aws:policy/AWSSupportAccess"
  ]
}

resource "aws_sns_topic" "sns_topic_for_aws_chatbot_primary_region" {
  #checkov:skip=CKV_AWS_26:
  name = "aws-chatbot-notifications"
  http_success_feedback_role_arn = aws_iam_role.delivery_status_logging_for_sns_topic.arn
  http_failure_feedback_role_arn = aws_iam_role.delivery_status_logging_for_sns_topic.arn
  tags = {
    Name = "aws_chatbot_notifications"
    Service = "monitoring"
  }
}

resource "aws_sns_topic" "sns_topic_for_aws_chatbot_us_east_1" {
  #checkov:skip=CKV_AWS_26:
  provider = aws.us-east-1
  name = "aws-chatbot-notifications"
  http_success_feedback_role_arn = aws_iam_role.delivery_status_logging_for_sns_topic.arn
  http_failure_feedback_role_arn = aws_iam_role.delivery_status_logging_for_sns_topic.arn
  tags = {
    Name = "aws_chatbot_notifications"
    Service = "monitoring"
  }
}

# Define SNS topic policy primary region
resource "aws_sns_topic_policy" "sns_topic_policy_for_aws_chatbot_primary_region" {
  arn = aws_sns_topic.sns_topic_for_aws_chatbot_primary_region.arn
  policy = data.aws_iam_policy_document.sns_topic_policy_for_aws_chatbot_primary_region.json
}

# Define SNS topic policy for us-east-1
resource "aws_sns_topic_policy" "sns_topic_policy_for_aws_chatbot_us_east_1" {
  provider = aws.us-east-1
  arn = aws_sns_topic.sns_topic_for_aws_chatbot_us_east_1.arn
  policy = data.aws_iam_policy_document.sns_topic_policy_for_aws_chatbot_us_east_1.json
}

# IAM role for delivery_status_logging_for_sns_topic
resource "aws_iam_role" "delivery_status_logging_for_sns_topic" {
  name = "aws-chatbot-delivery-status-logging"
  assume_role_policy = data.aws_iam_policy_document.sns_to_cw_logs_assume_role_policy.json
}

resource "aws_iam_policy" "delivery_status_logging_for_sns_topic_policy" {
  policy = data.aws_iam_policy_document.sns_to_cw_logs_policy.json
}

resource "aws_iam_role_policy_attachment" "delivery_status_logging_for_sns_topic_attachment" {
  role = aws_iam_role.delivery_status_logging_for_sns_topic.name
  policy_arn = aws_iam_policy.delivery_status_logging_for_sns_topic_policy.arn
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Data resources in data.tf for IAM policies etc. are intentionally left out of this blog post, but can be viewed at &lt;a href="https://github.com/haakond/terraform-aws-chatbot/blob/main/data.tf" rel="noopener noreferrer"&gt;https://github.com/haakond/terraform-aws-chatbot/blob/main/data.tf&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To deploy this module in your workload account include the following snippets, as explained in &lt;a href="https://github.com/haakond/terraform-aws-chatbot/blob/main/examples/main.tf" rel="noopener noreferrer"&gt;examples/main.tf&lt;/a&gt; and &lt;a href="https://github.com/haakond/terraform-aws-chatbot/blob/main/examples/provider.tf" rel="noopener noreferrer"&gt;examples/provider.tf&lt;/a&gt;.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;module "aws_chatbot_slack" {
  source = "git::https://github.com/haakond/terraform-aws-chatbot.git"
  providers = {
    aws = aws
    aws.us-east-1 = aws.us-east-1
    awscc = awscc
  }
  slack_channel_configuration_name = "slack-hed-aws-monitoring"
  slack_channel_id = "AABBCC001122DD88"
  slack_workspace_id = "ABCD1234EFGH5678"
  logging_level = "INFO"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;provider "aws" {
  region = var.aws_region
  profile = var.profile_cicd
  assume_role {
    role_arn = "arn:aws:iam::${var.aws_account_id}:role/${var.profile_cicd}"
    session_name = "SESSION_NAME"
    external_id = "EXTERNAL_ID"
  }
}

provider "awscc" {
  region = var.aws_region
  profile = var.profile_cicd
  assume_role = {
    role_arn = "arn:aws:iam::${var.aws_account_id}:role/${var.profile_cicd}"
    session_name = "SESSION_NAME"
    external_id = "EXTERNAL_ID"
  }
}

provider "aws" {
  alias = "us-east-1"
  region = "us-east-1"
  profile = var.profile_cicd
  assume_role {
    role_arn = "arn:aws:iam::${var.aws_account_id}:role/${var.profile_cicd}"
    session_name = "SESSION_NAME"
    external_id = "EXTERNAL_ID"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 3: Test notifications
&lt;/h4&gt;

&lt;p&gt;In the Configure channels overview, select the applicable channel and click Send test message.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbmyp7ein9y59yxmu81xl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbmyp7ein9y59yxmu81xl.png" width="800" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Expected result: Two messages, one for each SNS topic in us-east-1 and eu-central-1.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmfboe0ge0egixq48yimj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmfboe0ge0egixq48yimj.png" width="676" height="305"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I ask Amazon Q via AWS Chatbot about relevant CloudWatch metrics for ECS Fargate container services and get some helpful links in return.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff4jg6s2tdifrjf9m1pbr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff4jg6s2tdifrjf9m1pbr.png" width="692" height="433"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Sending CloudWatch alarms to Slack
&lt;/h4&gt;

&lt;p&gt;You can set up notifications to Slack for any CloudWatch metric and alarm that you care about: CPU and memory utilization, free disk space, database query latency and so on. I set up a Route 53 Health Check to monitor my blog’s availability, which ensures I am notified regardless of the actual root cause.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Define an additional provider for the us-east-1 region.
provider "aws" {
  alias = "us-east-1"
  region = "us-east-1"
  profile = var.profile_cicd
  assume_role {
    role_arn = "arn:aws:iam::${var.aws_account_id}:role/${var.profile_cicd}"
    session_name = "SESSION_NAME"
    external_id = "EXTERNAL_ID"
  }
}

# AWS Route 53 Health Checks and corresponding metrics reside in us-east-1.
resource "aws_route53_health_check" "hedrange_com_about" {
  provider = aws.us-east-1
  fqdn = "hedrange.com"
  port = 443
  type = "HTTPS"
  resource_path = "/about/"
  failure_threshold = "3"
  request_interval = "30"
  measure_latency = true
  invert_healthcheck = false
  regions = ["us-east-1", "us-west-1", "eu-west-1"]
  tags = {
    Name = "health-check-blog"
  }
}

# AWS Route 53 Health Checks and corresponding metrics reside in us-east-1, so the alarm has to be provisioned there as well.
resource "aws_cloudwatch_metric_alarm" "healthcheck_hedrange_com_about" {
  provider = aws.us-east-1
  alarm_name = "health-check-blog-about"
  comparison_operator = "LessThanThreshold"
  evaluation_periods = "3"
  metric_name = "HealthCheckStatus"
  namespace = "AWS/Route53"
  period = "60"
  statistic = "Minimum"
  threshold = "1"
  alarm_description = "CRITICAL - https://hedrange.com/about is unavailable!"
  dimensions = {
    HealthCheckId = aws_route53_health_check.hedrange_com_about.id
  }
  alarm_actions = [local.chatbot_sns_topic_arn_us_east_1]
  ok_actions = [local.chatbot_sns_topic_arn_us_east_1]
  tags = {
    Name = "alarm-health-check-blog",
    Severity = "CRITICAL"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To test the notification let’s amend &lt;code&gt;invert_healthcheck = true&lt;/code&gt; and re-provision.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fejjevtgvtv65yyg9jfnh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fejjevtgvtv65yyg9jfnh.png" width="800" height="403"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The CloudWatch alarm changed state from OK to In alarm, sent a notification to the SNS topic in us-east-1 configured for AWS Chatbot which then dispatched the following message to Slack:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkcpvpa6jbl6zznqhumn9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkcpvpa6jbl6zznqhumn9.png" width="785" height="425"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Recovery notification:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl28ib91ta614fcqyd3mq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl28ib91ta614fcqyd3mq.png" width="800" height="422"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Sending AWS Health notifications and Security Hub findings to Slack
&lt;/h4&gt;

&lt;p&gt;To ensure we are also notified about important AWS Health events and Security Hub findings, we can set up EventBridge rules accordingly.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Eventbridge rule with event pattern to catch high severity Security Hub findings, regardless of product.
resource "aws_cloudwatch_event_rule" "main_securityhub_event_rule" {
  name = "aws-securityhub-rule"
  description = "Capture AWS Security Hub events"

  event_pattern = &amp;lt;&amp;lt;EOF
{
    "source": [
        "aws.securityhub"
     ],
    "detail-type": [
        "Security Hub Findings - Imported"
    ],
    "detail": {
        "findings": {
            "Severity": {
                "Label": ["CRITICAL", "HIGH"]
            }
        }
    }
}
EOF
}

resource "aws_cloudwatch_event_target" "main_securityhub_rule_target_sns_topic_for_aws_chatbot" {
  rule = aws_cloudwatch_event_rule.main_securityhub_event_rule.name
  target_id = "SendToSNS"
  arn = aws_sns_topic.sns_topic_for_aws_chatbot.arn
}

# Eventbridge rule with event pattern to AWS Health notifications
resource "aws_cloudwatch_event_rule" "health_event_rule" {
  name = "aws-health-rule"
  description = "Capture AWS Health events"

  event_pattern = &amp;lt;&amp;lt;EOF
{
    "source": ["aws.health"],
    "detail-type": ["AWS Health Event"]
}
EOF
}

# EventBridge rule target to the SNS topic for AWS Chatbot
resource "aws_cloudwatch_event_target" "health_event_rule_target_sns_topic_for_aws_chatbot" {
  rule = aws_cloudwatch_event_rule.health_event_rule.name
  target_id = "SendToSNS"
  arn = aws_sns_topic.sns_topic_for_aws_chatbot.arn
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is what a Slack message looks like for a High severity finding in Security Hub about IAM Access Analyzer not being enabled in region eu-north-1.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdmtd0gpuykfraji8iew4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdmtd0gpuykfraji8iew4.png" width="710" height="346"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is what a notification from AWS Health looks like:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F07qt7970jcm7n0sopts8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F07qt7970jcm7n0sopts8.png" width="776" height="334"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  AWS Systems Manager – Incident Manager
&lt;/h2&gt;

&lt;p&gt;Incident Manager is a service that has flown a bit under the radar. Teams adopting this service directly in their AWS environments should be able to reduce their Recovery Time Objective and the consequences outages may have for customer applications.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/incident-manager/latest/userguide/what-is-incident-manager.html" rel="noopener noreferrer"&gt;AWS explains that&lt;/a&gt; Incident Manager helps reduce the time to resolve incidents by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Providing automated plans for efficiently engaging the people responsible for responding to the incidents.&lt;/li&gt;
&lt;li&gt;Providing relevant troubleshooting data.&lt;/li&gt;
&lt;li&gt;Enabling automated response actions by using predefined Automation runbooks.&lt;/li&gt;
&lt;li&gt;Providing methods to collaborate and communicate with all stakeholders.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Response plans and incident severity classification
&lt;/h3&gt;

&lt;p&gt;Events on the AWS platform can trigger Incidents using pre-defined Response Plans to get the attention of first responders, to quickly start troubleshooting while communicating efficiently with the AWS Chatbot integration for Slack and Microsoft Teams.&lt;/p&gt;

&lt;p&gt;Based on impact and scope, alarms and notifications can be differentiated by urgency, escalation and resolution procedures.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Impact code&lt;/th&gt;
&lt;th&gt;Impact name&lt;/th&gt;
&lt;th&gt;Sample defined scope&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Critical&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Full application failure that impacts most customers.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;High&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Full application failure that impacts a subset of customers.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Medium&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Partial application failure that is customer-impacting.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;4&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Low&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Intermittent failures that have limited impact on customers.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;5&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;No Impact&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Customers aren’t currently impacted but urgent action is needed to avoid impact.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Example triage table ref. &lt;a href="https://docs.aws.amazon.com/incident-manager/latest/userguide/incident-lifecycle.html#triage" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/incident-manager/latest/userguide/incident-lifecycle.html#triage&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
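&lt;p&gt;As a sketch of how a response plan tied to such an impact level could be expressed with the awscc provider: the snippet below uses the awscc_ssmincidents_response_plan resource, and the values (name, title, dedupe string) are illustrative assumptions, while the SNS topic and contact references point at resources defined elsewhere in this post. Verify the attribute schema against the awscc provider documentation for your version.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "awscc_ssmincidents_response_plan" "critical" {
  name = "critical-workload-outage"

  incident_template = {
    title  = "Critical workload outage"
    # Impact 1 = Critical, ref. the triage table above.
    impact = 1
    # Incidents created from the same source and dedupe string are grouped
    # into one incident instead of flooding responders with duplicates.
    dedupe_string = "critical-workload-outage"
  }

  # Post incident updates to Slack via the SNS topic configured for AWS Chatbot.
  chat_channel = {
    chatbot_sns = [aws_sns_topic.sns_topic_for_aws_chatbot_primary_region.arn]
  }

  # Engage the primary on-call contact defined further below.
  engagements = [aws_ssmcontacts_contact.primary_contact.arn]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;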

&lt;h3&gt;
  
  
  Notification deduplication
&lt;/h3&gt;

&lt;p&gt;A key feature of AWS SSM Incident Manager is the incident &lt;a href="https://docs.aws.amazon.com/incident-manager/latest/userguide/response-plans.html" rel="noopener noreferrer"&gt;deduplication feature&lt;/a&gt;, which groups similar notifications together, as opposed to direct notifications via CloudWatch =&amp;gt; SNS =&amp;gt; Slack. Incident Manager automatically deduplicates multiple incidents created by the same Amazon CloudWatch alarm or Amazon EventBridge event. This can reduce alert fatigue and ensure critical notifications aren’t missed.&lt;/p&gt;

&lt;p&gt;The purpose of this blog post is not to deep dive into the service itself, but to demonstrate how deployment can be automated with Terraform. For more information read &lt;a href="https://docs.aws.amazon.com/incident-manager/latest/userguide/incident-lifecycle.html" rel="noopener noreferrer"&gt;The incident lifecycle in Incident Manager&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqlj6b9tp5kjhhi2h42vx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqlj6b9tp5kjhhi2h42vx.png" width="800" height="600"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Diagram courtesy of AWS. Source: &lt;a href="https://docs.aws.amazon.com/incident-manager/latest/userguide/incident-lifecycle.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/incident-manager/latest/userguide/incident-lifecycle.html&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Setting up AWS Systems Manager – Incident Manager
&lt;/h3&gt;

&lt;p&gt;At first glance, most of the configuration options were not available in the official &lt;a href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/ssmincidents_replication_set" rel="noopener noreferrer"&gt;aws&lt;/a&gt; Terraform provider, so many steps had to be configured manually. However, after further searching I realized that the AWS Cloud Control Terraform provider, &lt;a href="https://registry.terraform.io/providers/hashicorp/awscc/latest/docs/" rel="noopener noreferrer"&gt;awscc&lt;/a&gt;, supports the resources in question, so I succeeded in defining a fully working Terraform module for this purpose. I have only parameterized the most relevant values; there are many configuration options to tweak, so I did not make a fully customizable module at this point in time. One approach could be to start here to become familiar, and then optimize later on.&lt;/p&gt;
&lt;h4&gt;
  
  
  Configure AWS Systems Manager for your applicable regions with region failover replication set
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_ssmincidents_replication_set" "default" {
  region {
    name = local.current_region
  }
  region {
    name = var.replication_set_fallback_region
  }

  tags = {
    Name = "default"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Configure contacts for your on-call team
&lt;/h4&gt;

&lt;p&gt;I have one primary contact for myself with contact methods email, SMS and voice/phone.&lt;/p&gt;

&lt;p&gt;Do note that the resource &lt;a href="https://registry.terraform.io/providers/hashicorp/awscc/latest/docs/resources/ssmcontacts_contact" rel="noopener noreferrer"&gt;awscc_ssmcontacts_contact&lt;/a&gt; is based on the &lt;a href="https://github.com/hashicorp/terraform-provider-awscc" rel="noopener noreferrer"&gt;AWS Cloud Control provider&lt;/a&gt;. It’s not that intuitive that this configuration results in an on-call schedule, so I spent quite some time figuring this out.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_ssmcontacts_contact" "primary_contact" {
  alias = var.primary_contact_alias
  display_name = var.primary_contact_display_name
  type = "PERSONAL"

  tags = {
    key = "primary-contact"
  }
  depends_on = [aws_ssmincidents_replication_set.default]
}

resource "aws_ssmcontacts_contact_channel" "primary_contact_email" {
  contact_id = aws_ssmcontacts_contact.primary_contact.arn

  delivery_address {
    simple_address = var.primary_contact_email_address
  }

  name = "primary-contact-email"
  type = "EMAIL"
}

resource "aws_ssmcontacts_contact_channel" "primary_contact_sms" {
  contact_id = aws_ssmcontacts_contact.primary_contact.arn

  delivery_address {
    simple_address = var.primary_contact_phone_number
  }

  name = "primary-contact-sms"
  type = "SMS"
}

resource "aws_ssmcontacts_contact_channel" "primary_contact_voice" {
  contact_id = aws_ssmcontacts_contact.primary_contact.arn

  delivery_address {
    simple_address = var.primary_contact_phone_number
  }

  name = "primary-contact-voice"
  type = "VOICE"
}

resource "aws_ssmcontacts_plan" "primary_contact" {
  contact_id = aws_ssmcontacts_contact.primary_contact.arn

  stage {
    duration_in_minutes = 1

    target {
      channel_target_info {
        retry_interval_in_minutes = 5
        contact_channel_id = aws_ssmcontacts_contact_channel.primary_contact_email.arn
      }
    }
  }
  stage {
    duration_in_minutes = 5

    target {
      channel_target_info {
        retry_interval_in_minutes = 5
        contact_channel_id = aws_ssmcontacts_contact_channel.primary_contact_sms.arn
      }
    }
  }
  stage {
    duration_in_minutes = 10

    target {
      channel_target_info {
        retry_interval_in_minutes = 5
        contact_channel_id = aws_ssmcontacts_contact_channel.primary_contact_voice.arn
      }
    }
  }
}

resource "awscc_ssmcontacts_contact" "oncall_schedule" {

  alias = "default-schedule"
  display_name = "default-schedule"
  type = "ONCALL_SCHEDULE"
  plan = [{
    rotation_ids = [aws_ssmcontacts_rotation.business_hours.id]
  }]
  depends_on = [aws_ssmincidents_replication_set.default]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Configure response plan – Example for type Critical Incident Process
&lt;/h4&gt;

&lt;p&gt;You can have as many response plans as you like, but as of June 2024 each of them is priced at &lt;a href="https://aws.amazon.com/systems-manager/pricing/" rel="noopener noreferrer"&gt;$7 per month&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;One consideration could be to have Critical and High response plans as a starting point and dispatch alarms accordingly.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_ssmincidents_response_plan" "critical_incident" {
  name = "CRITICAL-INCIDENT"

  incident_template {
    title = "CRITICAL-INCIDENT"
    impact = "1"
    incident_tags = {
      Name = "CRITICAL-INCIDENT"
    }

    summary = "Follow CRITICAL INCIDENT process."
  }

  display_name = "CRITICAL-INCIDENT"
  chat_channel = [var.chatbot_sns_topic_notification_arn]
  engagements = [awscc_ssmcontacts_contact.oncall_schedule.arn]

  action {
    ssm_automation {
      document_name = aws_ssm_document.critical_incident_runbook.arn
      role_arn = aws_iam_role.service_role_for_ssm_incident_manager.arn
      document_version = "$LATEST"
      target_account = "RESPONSE_PLAN_OWNER_ACCOUNT"
      parameter {
        name = "Environment"
        values = ["Production"]
      }
      dynamic_parameters = {
        resources = "INVOLVED_RESOURCES"
        incidentARN = "INCIDENT_RECORD_ARN"
      }
    }
  }

  tags = {
    Name = "critical-incident-response-plan"
  }

  depends_on = [aws_ssmincidents_replication_set.default]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
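&lt;p&gt;As a sketch of the Critical/High split mentioned above, a second response plan could mirror the critical one with a lower impact rating. The &lt;code&gt;HIGH-INCIDENT&lt;/code&gt; name and summary below are illustrative and not part of the module:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Illustrative example: a separate response plan for HIGH severity,
# using impact "2" instead of "1". Attributes follow the same
# aws_ssmincidents_response_plan resource as the critical plan.
resource "aws_ssmincidents_response_plan" "high_incident" {
  name = "HIGH-INCIDENT"

  incident_template {
    title   = "HIGH-INCIDENT"
    impact  = "2"
    summary = "Follow HIGH INCIDENT process."
  }

  display_name = "HIGH-INCIDENT"
  chat_channel = [var.chatbot_sns_topic_notification_arn]
  engagements  = [awscc_ssmcontacts_contact.oncall_schedule.arn]

  depends_on = [aws_ssmincidents_replication_set.default]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;CloudWatch alarms can then dispatch to either plan depending on severity.&lt;/p&gt;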



&lt;h4&gt;
  
  
  Configure on-call schedule
&lt;/h4&gt;

&lt;p&gt;In this example the on-call rotation schedule consists of only the primary contact. In a full deployment an on-call team normally consists of several people on a weekly schedule, with escalation/fallback. Feel free to adjust to your use-case.&lt;/p&gt;

&lt;p&gt;Each shift starts on Mondays at 09:00, and people are notified between 08:30 and 16:00, during business hours. You can choose any time period of the day, for instance one on-call schedule during business hours and another one outside of business hours.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_ssmcontacts_rotation" "business_hours" {
  contact_ids = [
    aws_ssmcontacts_contact.primary_contact.arn
  ]

  name = "business-hours"

  recurrence {
    number_of_on_calls = 1
    recurrence_multiplier = 1
    weekly_settings {
      day_of_week = "MON"
      hand_off_time {
        hour_of_day = 09
        minute_of_hour = 00
      }
    }

    weekly_settings {
      day_of_week = "FRI"
      hand_off_time {
        hour_of_day = 15
        minute_of_hour = 55
      }
    }

    shift_coverages {
      map_block_key = "MON"
      coverage_times {
        start {
          hour_of_day = 08
          minute_of_hour = 30
        }
        end {
          hour_of_day = 16
          minute_of_hour = 00
        }
      }
    }
    shift_coverages {
      map_block_key = "TUE"
      coverage_times {
        start {
          hour_of_day = 08
          minute_of_hour = 30
        }
        end {
          hour_of_day = 16
          minute_of_hour = 00
        }
      }
    }
    shift_coverages {
      map_block_key = "WED"
      coverage_times {
        start {
          hour_of_day = 08
          minute_of_hour = 30
        }
        end {
          hour_of_day = 16
          minute_of_hour = 00
        }
      }
    }
    shift_coverages {
      map_block_key = "THU"
      coverage_times {
        start {
          hour_of_day = 08
          minute_of_hour = 30
        }
        end {
          hour_of_day = 16
          minute_of_hour = 00
        }
      }
    }
    shift_coverages {
      map_block_key = "FRI"
      coverage_times {
        start {
          hour_of_day = 08
          minute_of_hour = 30
        }
        end {
          hour_of_day = 16
          minute_of_hour = 00
        }
      }
    }
  }

  start_time = var.rotation_start_time
  time_zone_id = "Europe/Oslo"
  depends_on = [aws_ssmincidents_replication_set.default]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Configure Systems Manager Runbook to be executed for the Critical Incident Process
&lt;/h4&gt;

&lt;p&gt;The AWS Systems Manager Runbook AWSIncidents-CriticalIncidentRunbookTemplate can be used as a starting point for this use-case. For more information see &lt;a href="https://docs.aws.amazon.com/incident-manager/latest/userguide/runbooks.html" rel="noopener noreferrer"&gt;Working with Systems Manager Automation runbooks in Incident Manager&lt;/a&gt;.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_ssm_document" "critical_incident_runbook" {
  name = "critical_incident_runbook"
  document_type = "Automation"
  document_format = "YAML"
  content = &amp;lt;&amp;lt;DOC
#
# Original source: AWSIncidents-CriticalIncidentRunbookTemplate
#
# Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this
# software and associated documentation files (the "Software"), to deal in the Software
# without restriction, including without limitation the rights to use, copy, modify,
# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#
---
description: "This document is intended as a template for an incident response runbook in [Incident Manager](https://docs.aws.amazon.com/incident-manager/latest/userguide/index.html).\n\nFor optimal use, create your own automation document by copying the contents of this runbook template and customizing it for your scenario. Then, navigate to your [Response Plan](https://console.aws.amazon.com/systems-manager/incidents/response-plans/home) and associate it with your new automation document; your runbook is automatically started when an incident is created with the associated response plan. For more information, see [Incident Manager - Runbooks](https://docs.aws.amazon.com/incident-manager/latest/userguide/runbooks.html). \v\n\nSuggested customizations include:\n* Updating the text in each step to provide specific guidance and instructions, such as commands to run or links to relevant dashboards\n* Automating actions before triage or diagnosis to gather additional telemetry or diagnostics using aws:executeAwsApi\n* Automating actions in mitigation using aws:executeAutomation, aws:executeScript, or aws:invokeLambdaFunction\n"
schemaVersion: '0.3'
parameters:
  Environment:
    type: String
  incidentARN:
    type: String
  resources:
    type: String
mainSteps:
  - name: Triage
    action: 'aws:pause'
    inputs: {}
    description: |-
      **Determine customer impact**

      * View the **Metrics** tab of the incident or navigate to your [CloudWatch Dashboards](https://console.aws.amazon.com/cloudwatch/home#dashboards:) to find key performance indicators (KPIs) that show the extent of customer impact.
      * Use [CloudWatch Synthetics](https://console.aws.amazon.com/cloudwatch/home#synthetics:) and [Contributor Insights](https://console.aws.amazon.com/cloudwatch/home#contributorinsights:) to identify real-time failures in customer workflows.

      **Communicate customer impact**

      Update the following fields to accurately describe the incident:
      * **Title** - The title should be quickly recognizable by the team and specific to the particular incident.
      * **Summary** - The summary should contain the most important and up-to-date information to quickly onboard new responders to the incident.
      * **Impact** - Select one of the following impact ratings to describe the incident:
        * 1 – Critical impact, full application failure that impacts many to all customers.
        * 2 – High impact, partial application failure with impact to many customers.
        * 3 – Medium impact, the application is providing reduced service to many customers.
        * 4 – Low impact, the application is providing reduced service to few customers.
        * 5 – No impact, customers are not currently impacted but urgent action is needed to avoid impact.
  - name: Diagnosis
    action: 'aws:pause'
    inputs: {}
    description: |
      **Rollback**

      * Look for recent changes to the production environment that might have caused the incident. Engage the responsible team using the **Contacts** tab of the incident.
      * Rollback these changes if possible.

      **Locate failures**
      * Review metrics and alarms related to your [Application](https://console.aws.amazon.com/systems-manager/appmanager/applications). Add any related metrics and alarms to the **Metrics** tab of the incident.
      * Use [CloudWatch ServiceLens](https://console.aws.amazon.com/cloudwatch/home#servicelens:) to troubleshoot issues across multiple services.
      * Investigate the possibility of ongoing incidents across your organization. Check for known incidents and issues in AWS using [Personal Health Dashboard](https://console.aws.amazon.com/systems-manager/insights). Add related links to the **Related Items** tab of the incident.
      * Avoid going too deep in diagnosing the failure and focus on how to mitigate the customer impact. Update the **Timeline** tab of the incident when a possible diagnosis is identified.
  - name: Mitigation
    action: 'aws:pause'
    description: |-
      **Collaborate**
      * Communicate any changes or important information from the previous step to the members of the associated chat channel for this incident. Ask for input on possible ways to mitigate customer impact.
      * Engage additional contacts or teams using their escalation plan from the **Contacts** tab.
      * If necessary, prepare an emergency change request in [Change Manager](https://console.aws.amazon.com/systems-manager/change-manager).

      **Implement mitigation**
      * Consider re-routing customer traffic or throttling incoming requests to reduce customer impact.
      * Look for common runbooks in [Automation](https://console.aws.amazon.com/systems-manager/automation) or run commands using [Run Command](https://console.aws.amazon.com/systems-manager/run-command).
      * Update the **Timeline** tab of the incident when a possible mitigation is identified. If needed, review the mitigation with others in the associated chat channel before proceeding.
    inputs: {}
  - name: Recovery
    action: 'aws:pause'
    inputs: {}
    description: |-
      **Monitor customer impact**
      * View the **Metrics** tab of the incident to monitor for recovery of your key performance indicators (KPIs).
      * Update the **Impact** field in the incident when customer impact has been reduced or resolved.

      **Identify action items**
      * Add entries in the **Timeline** tab of the incident to record key decisions and actions taken, including temporary mitigations that might have been implemented.
      * Create a **Post-Incident Analysis** when the incident is closed in order to identify and track action items in [OpsCenter](https://console.aws.amazon.com/systems-manager/opsitems).

DOC
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Deploying AWS Systems Manager – Incident Manager with Terraform
&lt;/h4&gt;

&lt;p&gt;In addition to the snippets above, there are resource configurations for IAM and Terraform data sources. To keep the length of this blog post manageable, I only refer to them in the full module code repository.&lt;/p&gt;

&lt;p&gt;This is how I configure the module call in my workload provisioning pipeline. In a production environment in a DevOps model I would provision this module in every production workload account a DevOps team is responsible for, along with the applications and CloudWatch monitoring data. Feel free to adjust to how your organization is set up.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Use aws and awscc providers to provision SSM Incident Manager
# Ref. https://registry.terraform.io/providers/hashicorp/aws/latest/docs/guides/using-aws-with-awscc-provider
# Refers to output from the previous module provision for AWS Chatbot. 
module "ssm_incident_manager" {
  source = "git::https://github.com/haakond/terraform-aws-ssm-incident-manager.git?ref=dev"
  providers = {
    aws = aws
    awscc = awscc
  }
  primary_contact_alias = "primary-contact"
  primary_contact_display_name = "Håkon Eriksen Drange"
  primary_contact_email_address = "alpha.bravo@charlie-company.com"
  primary_contact_phone_number = "+4799887766"
  chatbot_sns_topic_notification_arn = module.aws_chatbot_slack.chatbot_sns_topic_arn_primary_region
  rotation_start_time = "2024-06-24T07:00:00+00:00"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since we had to use a combination of the aws and awscc Terraform providers, remember to include something similar to this in your provider.tf:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;provider "aws" {
  region = var.aws_region
  profile = var.profile_cicd
  assume_role {
    role_arn = "arn:aws:iam::${var.aws_account_id}:role/${var.profile_cicd}"
    session_name = "SESSION_NAME"
    external_id = "EXTERNAL_ID"
  }
}

provider "awscc" {
  region = var.aws_region
  profile = var.profile_cicd
  assume_role = {
    role_arn = "arn:aws:iam::${var.aws_account_id}:role/${var.profile_cicd}"
    session_name = "SESSION_NAME"
    external_id = "EXTERNAL_ID"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
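&lt;p&gt;Since both providers are in play, the module’s provider requirements should also be declared, along these lines (the version constraints are illustrative; pin to what you have tested):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Declare both providers so Terraform pulls them from the registry.
# Version constraints are illustrative only.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    awscc = {
      source  = "hashicorp/awscc"
      version = "~> 1.0"
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;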



&lt;p&gt;Module reference with example: &lt;a href="https://github.com/haakond/terraform-aws-ssm-incident-manager/blob/main/examples/main.tf" rel="noopener noreferrer"&gt;https://github.com/haakond/terraform-aws-ssm-incident-manager/blob/main/examples/main.tf&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Using AWS SSM Incident Manager
&lt;/h3&gt;

&lt;p&gt;Provisioning this module should yield the following results.&lt;/p&gt;

&lt;h4&gt;
  
  
  Activate contact channels
&lt;/h4&gt;

&lt;p&gt;The first step is to go to Contacts and activate the configured channels.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbxqr45jf90z039g8yupc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbxqr45jf90z039g8yupc.png" width="800" height="165"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You will receive a one-time passcode on each of the configured channels. Each of them needs to be activated before the contact is enabled.&lt;/p&gt;
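&lt;p&gt;If you prefer the CLI over the console, a channel can be activated with the one-time code, roughly as sketched below. The channel ARN and code are placeholders:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Activate a contact channel with the one-time code received on it.
# Replace the ARN and code with your own values.
aws ssm-contacts activate-contact-channel \
  --contact-channel-id "arn:aws:ssm-contacts:eu-west-1:111122223333:contact-channel/primary-contact/example" \
  --activation-code "123456"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;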

&lt;h4&gt;
  
  
  On-call schedule – calendar overview
&lt;/h4&gt;

&lt;p&gt;The configured on-call schedule looks like this. The team will only be notified through the configured contact channels during business hours. I would also set up an out-of-business-hours schedule with the appropriate configuration.&lt;/p&gt;

&lt;p&gt;It’s also possible to create shift overrides, in case a team member is asked to cover for a sick colleague etc.&lt;/p&gt;
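&lt;p&gt;As a sketch, such an override can be created with the AWS CLI; the rotation ARN, contact ARN and timestamps below are placeholders:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Temporarily replace who is on call for a given time window.
# All identifiers and timestamps are placeholders.
aws ssm-contacts create-rotation-override \
  --rotation-id "arn:aws:ssm-contacts:eu-west-1:111122223333:rotation/business-hours" \
  --new-contact-ids "arn:aws:ssm-contacts:eu-west-1:111122223333:contact/backup-contact" \
  --start-time "2024-07-01T06:30:00Z" \
  --end-time "2024-07-01T14:00:00Z"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;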

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fika2tl5y550f95r5dmwz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fika2tl5y550f95r5dmwz.png" width="800" height="415"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Fire drill
&lt;/h3&gt;

&lt;p&gt;I provisioned a temporary Amazon Linux 2 EC2 instance and set up a CloudWatch alarm to simulate unusually high CPU utilization over an extended period of time, where Auto Scaling was not able to provision enough capacity for the load spike. This can be any CloudWatch alarm. Route 53 Health Checks and CloudWatch Synthetics monitors are also good candidates.&lt;/p&gt;

&lt;p&gt;Alarm and OK actions are configured with the relevant SNS topics for AWS Chatbot provisioned in the previous module.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_cloudwatch_metric_alarm" "demo_ec2_instance_cpu_utilization" {
  alarm_name = "CRITICAL-EC2-CPU_i-04f7af9ea5374c4d8"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods = "3"
  metric_name = "CPUUtilization"
  namespace = "AWS/EC2"
  period = "300"
  statistic = "Average"
  threshold = "80"
  alarm_description = "CRITICAL CPU utilization for instance ID i-04f7af9ea5374c4d8!"
  dimensions = {
    InstanceId = "i-04f7af9ea5374c4d8"
  }
  alarm_actions = [module.aws_chatbot_slack.chatbot_sns_topic_arn_primary_region, module.ssm_incident_manager.critical_incident_response_plan_arn]
  ok_actions = [module.aws_chatbot_slack.chatbot_sns_topic_arn_primary_region]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I log in to the demo EC2 instance through AWS Systems Manager – Session Manager and run the following commands to generate high CPU load for 30 minutes.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo amazon-linux-extras install epel -y
sudo yum install stress -y
stress --cpu 4 --timeout 30m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The CloudWatch Alarm dispatches to SNS for AWS Chatbot and the result is the following message on Slack:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F20r3zz8jehtxyiicqpge.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F20r3zz8jehtxyiicqpge.png" width="784" height="825"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The CloudWatch Alarm’s secondary action is to trigger an AWS Systems Manager – Incident Manager Response Plan, which is also set up to post to Slack:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxtnrnjohbq5r69733g0z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxtnrnjohbq5r69733g0z.png" width="758" height="526"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After 1 minute, as per the contact plan configuration, I receive this email:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fydrgqg71814vjz0z5gzc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fydrgqg71814vjz0z5gzc.png" width="645" height="503"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Shortly after I receive this SMS:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq1kf2zbeah6w9cyb33oc.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq1kf2zbeah6w9cyb33oc.jpg" width="800" height="730"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I then follow the instructions to acknowledge the incident:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqtpqbpb0bfkxovqwt5kj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqtpqbpb0bfkxovqwt5kj.png" width="800" height="253"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If this was during the night and I was asleep, the next phase would be an automated phone call.&lt;/p&gt;

&lt;p&gt;My team-mates can see on Slack that I have assumed ownership of the incident.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftqqgl4js6vdkuuvh7bj3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftqqgl4js6vdkuuvh7bj3.png" width="602" height="150"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is how the incident is tracked in AWS Systems Manager – Incident Manager. The SSM runbook deployed with Terraform is triggered, providing guidance on how to handle the Critical Incident process.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx1yx9yj2al1bwodovzlk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx1yx9yj2al1bwodovzlk.png" width="800" height="367"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The runbook lays out the process for us and the first step is to examine the customer impact, called Triage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgkdwcrqii1lp2it0tgne.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgkdwcrqii1lp2it0tgne.png" width="800" height="408"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The metric for the CloudWatch Alarm which triggered the Incident is automatically included in the Incident Overview.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc8ksf4eexmpo6783y7pr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc8ksf4eexmpo6783y7pr.png" width="800" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Clear instructions are provided for the next phases as well. You can customize all this in your own company runbook.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fybozq1x45fvgcyem1vzt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fybozq1x45fvgcyem1vzt.png" width="800" height="492"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The situation has recovered, so we complete the Recovery section of the runbook.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5r6ck9v8yru6v0lun068.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5r6ck9v8yru6v0lun068.png" width="749" height="630"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Incident Manager provides a full timeline overview of all events, for documentation and further forensics.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6x2evt9j5q3izyzc7o8f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6x2evt9j5q3izyzc7o8f.png" width="800" height="451"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can then create a post-incident analysis / post-mortem analysis using a recommended template (which also can be customized).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fydhnv0s2kp1c9ljjg7tj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fydhnv0s2kp1c9ljjg7tj.png" width="609" height="369"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyd67aqeyjm5lqr9vtsrk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyd67aqeyjm5lqr9vtsrk.png" width="800" height="421"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can pull in relevant CloudWatch metrics to put the event into perspective.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs8adycylkkzs5cfidj7o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs8adycylkkzs5cfidj7o.png" width="800" height="348"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Sample questions for a blameless, constructive review about areas that could be improved are also provided:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fhedrange.com%2Fwp-content%2Fuploads%2F2024%2F06%2Fimage-28-1024x584-640x480.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fhedrange.com%2Fwp-content%2Fuploads%2F2024%2F06%2Fimage-28-1024x584-640x480.png" title="image-28" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Follow-up action items can be defined as well.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhdvenaqwzyngyveowsw9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhdvenaqwzyngyveowsw9.png" width="800" height="117"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As we can see, AWS Systems Manager – Incident Manager provides functionality for handling the entire lifecycle of critical events.&lt;/p&gt;

&lt;h3&gt;
  
  
  Incident Manager Pricing
&lt;/h3&gt;

&lt;p&gt;AWS SSM Incident Manager pricing is &lt;a href="https://aws.amazon.com/systems-manager/pricing/" rel="noopener noreferrer"&gt;$7&lt;/a&gt; per Response Plan per month. 100 SMS &amp;amp; Voice messages are included free of charge. Destination country rates can be found here: &lt;a href="https://aws.amazon.com/systems-manager/pricing/country-rates/" rel="noopener noreferrer"&gt;https://aws.amazon.com/systems-manager/pricing/country-rates/&lt;/a&gt;.&lt;/p&gt;
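&lt;p&gt;As a rough illustration of what provisioning looks like, the Terraform sketch below creates a minimal Response Plan with the &lt;code&gt;aws_ssmincidents_response_plan&lt;/code&gt; resource. The names, region and impact level are placeholders, and Incident Manager requires a replication set to exist before any response plan can be created:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Incident Manager requires a replication set before any response plan.
resource "aws_ssmincidents_replication_set" "main" {
  region {
    name = "eu-west-1" # placeholder region
  }
}

resource "aws_ssmincidents_response_plan" "webapp" {
  name = "webapp-critical" # placeholder name

  incident_template {
    title  = "Webapp service degradation"
    impact = 2 # 1 = critical ... 5 = no impact
  }

  depends_on = [aws_ssmincidents_replication_set.main]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;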

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this blog post we explored a solution for handling operational events efficiently, with the primary objective of restoring quality of service and end user experience as quickly as possible. We saw how AWS Chatbot and AWS SSM Incident Manager integrate nicely with AWS services such as AWS Health, AWS Security Hub and any AWS CloudWatch alarm. Making operational information and CloudWatch metrics available in Slack/Microsoft Teams, where most of the daily interaction takes place, is something that I personally appreciate. Most people have Slack/Teams on their mobile devices, so this can really increase the quality of internal communication by removing the need to log in to the AWS Console or 3rd party systems.&lt;/p&gt;

&lt;p&gt;By following the described steps, rooted in the AWS Well-Architected Framework, and deploying the provided Terraform sample code, organizations can improve their operational procedures, increase resiliency and work smarter.&lt;/p&gt;

&lt;p&gt;Terraform module for AWS Chatbot: &lt;a href="https://github.com/haakond/terraform-aws-chatbot/blob/main/README.md" rel="noopener noreferrer"&gt;https://github.com/haakond/terraform-aws-chatbot/blob/main/README.md&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Terraform module for AWS Systems Manager Incident Manager: &lt;a href="https://github.com/haakond/terraform-aws-ssm-incident-manager/blob/main/README.md" rel="noopener noreferrer"&gt;https://github.com/haakond/terraform-aws-ssm-incident-manager/blob/main/README.md&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/operational-excellence-pillar/welcome.html" rel="noopener noreferrer"&gt;AWS Well-Architected Framework – Operational Excellence Pillar whitepaper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/welcome.html?ref=wellarchitected-wp" rel="noopener noreferrer"&gt;AWS Well-Architected Framework – Reliability Pillar whitepaper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/chatbot/latest/adminguide/what-is.html" rel="noopener noreferrer"&gt;What is AWS Chatbot?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/chatbot/latest/adminguide/performing-actions.html" rel="noopener noreferrer"&gt;AWS Chatbot – Performing actions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/incident-manager/latest/userguide/what-is-incident-manager.html" rel="noopener noreferrer"&gt;AWS Systems Manager Incident Manager introduction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/incident-manager/latest/userguide/incident-response.html" rel="noopener noreferrer"&gt;Preparing for incidents in Incident Manager&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/incident-manager/latest/userguide/analysis.html" rel="noopener noreferrer"&gt;Performing a post-incident analysis in Incident Manager&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/incident-manager/latest/userguide/integration.html" rel="noopener noreferrer"&gt;Product and service integrations with Incident Manager&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The post &lt;a href="https://hedrange.com/2024/05/31/never-miss-an-alert-with-aws-chatbot-and-aws-ssm-incident-manager/" rel="noopener noreferrer"&gt;Never miss an alert with AWS Chatbot and AWS SSM – Incident Manager&lt;/a&gt; first appeared on &lt;a href="https://hedrange.com" rel="noopener noreferrer"&gt;Håkon Eriksen Drange&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>articles</category>
      <category>aws</category>
      <category>chatbot</category>
      <category>slack</category>
    </item>
    <item>
      <title>Protect your webapps from malicious traffic with AWS Web Application Firewall</title>
      <dc:creator>Håkon Eriksen Drange</dc:creator>
      <pubDate>Thu, 23 May 2024 12:56:54 +0000</pubDate>
      <link>https://forem.com/haakoned/protect-your-webapps-from-malicious-traffic-with-aws-web-application-firewall-39o</link>
      <guid>https://forem.com/haakoned/protect-your-webapps-from-malicious-traffic-with-aws-web-application-firewall-39o</guid>
      <description>&lt;p&gt;&lt;strong&gt;Table of contents&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
Introduction

&lt;ul&gt;
&lt;li&gt;The Web Application Firewall concept&lt;/li&gt;
&lt;li&gt;Deployment options&lt;/li&gt;
&lt;li&gt;Option #1 – Application layer&lt;/li&gt;
&lt;li&gt;Option #2 – Webserver module&lt;/li&gt;
&lt;li&gt;Option #3 – Virtual appliance&lt;/li&gt;
&lt;li&gt;
Option #4 – AWS native service – Web Application Firewall

&lt;ul&gt;
&lt;li&gt;AWS native service – WAF – Regional deployment&lt;/li&gt;
&lt;li&gt;AWS native service – WAF – Global edge network&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Introduction to how AWS WAF works&lt;/li&gt;

&lt;li&gt;AWS WAF Traffic dashboard insight&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;AWS WAF provisioning with Terraform&lt;/li&gt;

&lt;li&gt;AWS WAF pricing&lt;/li&gt;

&lt;li&gt;Conclusion and recommendations&lt;/li&gt;

&lt;li&gt;References and additional resources&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;As a rapidly increasing number of companies move workloads to the cloud and extend their footprint through refactoring and modernization, the possible attack vectors keep expanding. According to &lt;a href="https://www.datadoghq.com/state-of-cloud-security/" rel="noopener noreferrer"&gt;Datadog's State of Cloud Security report&lt;/a&gt;, a substantial portion of cloud workloads are excessively privileged [&lt;a href="https://www.datadoghq.com/state-of-cloud-security/#5" rel="noopener noreferrer"&gt;FACT 5&lt;/a&gt;] and many virtual machines are publicly exposed to the internet [&lt;a href="https://www.datadoghq.com/state-of-cloud-security/#6" rel="noopener noreferrer"&gt;FACT 6&lt;/a&gt;].&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;In AWS, only a small number (1.5 percent) of Amazon EC2 instances have full administrator privileges. Overall, nearly one in four EC2 instances (23 percent) have administrator or highly sensitive permissions to the AWS account they run in. An attacker does not need full administrator privileges to have a substantial impact—there are other, more common and challenging-to-uncover types of permissions they can leverage. We found that:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;&lt;strong&gt;5.4 percent of EC2 instances have risky permissions that allow lateral movement&lt;/strong&gt; in the account, such as connecting to other instances using SSM Sessions Manager.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;strong&gt;7.2 percent allow an attacker to gain full administrative access&lt;/strong&gt; to the account by privilege escalation, such as permissions to create a new IAM user with administrator privileges.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;strong&gt;20 percent have excessive data access&lt;/strong&gt;, such as listing and accessing data from all S3 buckets in the account.&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;(Note that these conditions are not mutually exclusive—a specific instance can fall into several of these categories.)&lt;/em&gt;&lt;br&gt;
&lt;cite&gt;&lt;em&gt;&lt;a href="https://www.datadoghq.com/state-of-cloud-security/#5" rel="noopener noreferrer"&gt;FACT 5&lt;/a&gt;: A substantial portion of cloud workloads are excessively privileged – DataDog State of Cloud Security &lt;/em&gt;&lt;/cite&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;7 percent of EC2 instances&lt;/strong&gt;, &lt;strong&gt;3 percent of Azure VMs&lt;/strong&gt;, and &lt;strong&gt;13 percent of Google Cloud VMs&lt;/strong&gt; are publicly exposed to the internet. Among instances that are publicly exposed, HTTP and HTTPS are the most commonly exposed ports, and are not considered risky in general. After these, SSH and RDP remote access protocols are common.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;cite&gt;&lt;em&gt;&lt;a href="https://www.datadoghq.com/state-of-cloud-security/#6" rel="noopener noreferrer"&gt;FACT 6&lt;/a&gt;: Many virtual machines remain publicly exposed to the internet – DataDog State of Cloud Security&lt;/em&gt;&lt;/cite&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Scanning of public internet resources is happening all the time while malicious actors are becoming more and more sophisticated. Companies can have a tight perimeter for employee IAM-credentials with access to the AWS Console/CLI, but still have the back door wide open with unsecured web applications.&lt;/p&gt;

&lt;p&gt;In the &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/security.html" rel="noopener noreferrer"&gt;Security Pillar&lt;/a&gt; of the &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/welcome.html" rel="noopener noreferrer"&gt;AWS Well-Architected Framework&lt;/a&gt;, &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/sec-design.html" rel="noopener noreferrer"&gt;a key design principle&lt;/a&gt; is to apply security at all layers with a defense in depth approach for edge of network, VPC, load balancing, instance/compute, operating system, application and code.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Web Application Firewall concept
&lt;/h3&gt;

&lt;p&gt;You may be familiar with traditional firewalls that function on Layer 3/4 by defining rules for protocols and port ranges. As the HTTP protocol is a &lt;a href="https://en.wikipedia.org/wiki/OSI_model#Layer_architecture" rel="noopener noreferrer"&gt;Layer 7 construct&lt;/a&gt; we need something a bit more advanced to inspect, monitor, filter/block unwanted requests to and from a web service.&lt;/p&gt;

&lt;p&gt;A Web Application Firewall can help by preventing attacks exploiting a web application’s known vulnerabilities. The &lt;a href="https://owasp.org/Top10/#welcome-to-the-owasp-top-10-2021" rel="noopener noreferrer"&gt;OWASP Top Ten list&lt;/a&gt; defines the most common attack vectors and is updated on a regular basis:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Broken Access Control

&lt;ul&gt;
&lt;li&gt;Violation of principle of least privilege/elevation of privilege&lt;/li&gt;
&lt;li&gt;Bypassing access control checks/viewing someone else’s account&lt;/li&gt;
&lt;li&gt;Lack of multi-factor authentication&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Cryptographic Failures

&lt;ul&gt;
&lt;li&gt;Lack of proper encryption at rest/in transit&lt;/li&gt;
&lt;li&gt;Old or weak cryptographic algorithms (not modern TLS)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Injection

&lt;ul&gt;
&lt;li&gt;Lack of proper input validation/sanitation&lt;/li&gt;
&lt;li&gt;SQL injection&lt;/li&gt;
&lt;li&gt;Cross-site scripting (XSS)&lt;/li&gt;
&lt;li&gt;OS command, ORM, LDAP injection&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Insecure Design&lt;/li&gt;
&lt;li&gt;Security Misconfiguration&lt;/li&gt;
&lt;li&gt;Vulnerable and Outdated Components

&lt;ul&gt;
&lt;li&gt;Operating system, web/application server, DBMS etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Identification and Authentication Failures

&lt;ul&gt;
&lt;li&gt;Brute force or automated attacks&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Software and Data Integrity Failures&lt;/li&gt;
&lt;li&gt;Security Logging and Monitoring Failures

&lt;ul&gt;
&lt;li&gt;Insufficient audit logging&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Server-Side Request Forgery&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;One caveat with a WAF is that, depending on the deployment model, it may be resource intensive. All this inspection and filtering generates CPU load, which in a traditional datacenter environment means the web server has less capacity left for serving legitimate traffic, leading to performance degradation, especially under heavy load or attack.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deployment options
&lt;/h3&gt;

&lt;p&gt;A number of Web Application Firewall solutions are available, each with its pros and cons.&lt;/p&gt;

&lt;h4&gt;
  
  
  Option #1 – Application layer
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftlmjpovu48s4edr8a13m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftlmjpovu48s4edr8a13m.png" width="800" height="564"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The first option is to implement WAF-like capabilities either in your own code base or by including a framework component such as &lt;a href="https://shieldon.io/en" rel="noopener noreferrer"&gt;ShieldON for PHP&lt;/a&gt;. Although it may seem handy, this approach is tightly coupled to your application code, and since it's deployed on the same compute option (VM or container), it comes with the most severe performance impact.&lt;/p&gt;

&lt;h4&gt;
  
  
  Option #2 – Webserver module
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft1u2vlw039cf6a4rnybo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft1u2vlw039cf6a4rnybo.png" width="800" height="564"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;By moving up one level in the stack and configuring WAF as a module in your web server of choice you can decouple from your application code base.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/owasp-modsecurity/ModSecurity" rel="noopener noreferrer"&gt;ModSecurity&lt;/a&gt;is a traditional option for &lt;a href="https://www.linode.com/docs/guides/securing-apache2-with-modsecurity/" rel="noopener noreferrer"&gt;Apache&lt;/a&gt; and &lt;a href="https://www.linode.com/docs/guides/securing-nginx-with-modsecurity/" rel="noopener noreferrer"&gt;Nginx&lt;/a&gt;web servers. If deployed on the same instance both your application and the WAF would compete for CPU resources. In some situations WAF can be extracted and deployed as a separate proxy tier, as further described below.&lt;/p&gt;

&lt;h4&gt;
  
  
  Option #3 – Virtual appliance
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuxzhu0zewsoycjj4pioh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuxzhu0zewsoycjj4pioh.png" width="800" height="564"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this model the WAF component is separated from the application layer so that it can be managed and scaled independently. The main advantage is that malicious traffic can be blocked before reaching the application compute resources, ensuring maximum performance for legitimate requests. This also opens up the possibility to share the WAF component across application workloads in the same region.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/marketplace/solutions/security/web-application-firewall?aws-marketplace-cards.sort-by=item.additionalFields.sortOrder&amp;amp;aws-marketplace-cards.sort-order=asc" rel="noopener noreferrer"&gt;AWS Marketplace has multiple options for Cloud WAF-as-a-Service virtual appliances&lt;/a&gt; and most premium solutions can be integrated with existing enterprise management, logging and reporting tools. Some also deploy HAproxy or Nginx with WAF like capabilities like the aforementioned &lt;a href="https://github.com/owasp-modsecurity/ModSecurity" rel="noopener noreferrer"&gt;ModSecurity&lt;/a&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Option #4 – AWS native service – Web Application Firewall
&lt;/h4&gt;

&lt;p&gt;Our fourth option is AWS Web Application Firewall, a fully managed service. With this option you pay no license fees, only for what you use, and the WAF component can also be scaled and managed as with option #3, decoupled from the application layer. Some of the main benefits are that AWS WAF is tightly integrated with AWS services such as Amazon Cloudfront, AWS Application Load Balancer, AWS API Gateway, AWS Appsync and AWS Shield for DDoS protection. It’s relatively straightforward to set up and deploy and builders already familiar with AWS won’t have to learn something new (or relate to a 3rd party provider with possibly sub-optimal licensing agreements).&lt;/p&gt;

&lt;p&gt;AWS WAF also supports Bot Control, which provides visibility and control over common and pervasive bot traffic that can consume excess resources or lead to downtime, and Fraud Control, which can protect login and sign-up pages against attacks such as credential stuffing, credential cracking and fake account creation.&lt;/p&gt;

&lt;p&gt;Relevant rules are configured in ordered priority, like traditional firewall rules, and you can choose from ALLOW, BLOCK, COUNT or CAPTCHA/Challenge actions to achieve the desired behaviour.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ALLOW all requests except the ones that you specify&lt;/li&gt;
&lt;li&gt;BLOCK all requests except the ones that you specify&lt;/li&gt;
&lt;li&gt;COUNT requests that match certain criteria&lt;/li&gt;
&lt;li&gt;Run CAPTCHA or challenge checks against requests that match certain criteria&lt;/li&gt;
&lt;/ul&gt;
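
&lt;p&gt;To make this concrete, here is a minimal Terraform sketch of a Web ACL that allows all traffic by default and BLOCKs requests matching a geo rule. All names are placeholders, as is the ISO country code:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_wafv2_web_acl" "example" {
  name  = "example-web-acl" # placeholder name
  scope = "REGIONAL"        # use "CLOUDFRONT" (in us-east-1) for the edge network

  # Requests matching no rule fall through to the default action.
  default_action {
    allow {}
  }

  rule {
    name     = "block-selected-countries"
    priority = 1

    action {
      block {}
    }

    statement {
      geo_match_statement {
        country_codes = ["XX"] # placeholder ISO country code
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "block-selected-countries"
      sampled_requests_enabled   = true
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "example-web-acl"
    sampled_requests_enabled   = true
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;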

&lt;h5&gt;
  
  
  AWS native service – WAF – Regional deployment
&lt;/h5&gt;

&lt;p&gt;In this scenario AWS WAF is configured for Application Load Balancers, Amazon API Gateways or AWS AppSync at the regional level.&lt;/p&gt;

&lt;p&gt;AWS Shield L3+L4 standard protection is included without additional charges, but AWS Shield Advanced (L7 protection) is not supported.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffl3b3098l95q1j1py84u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffl3b3098l95q1j1py84u.png" width="800" height="519"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h5&gt;
  
  
  AWS native service – WAF – Global edge network
&lt;/h5&gt;

&lt;p&gt;In this scenario AWS WAF is configured for Amazon Cloudfront at the global edge network level.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Futnj15xtdeojhmvmnyt5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Futnj15xtdeojhmvmnyt5.png" width="800" height="438"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Mitigation of large scale attacks is most efficient the further “out” you get, because the combined global network capacity is larger than regional capacity, so moving from a regional perimeter to the AWS global edge network is highly recommended. By adopting WAF with Cloudfront you can get full &lt;a href="https://aws.amazon.com/shield/" rel="noopener noreferrer"&gt;AWS Shield&lt;/a&gt; DDoS protection (Standard for L3/4, or Advanced for L3/4+L7 for mission critical workloads) and give AWS the optimal preconditions for mitigation. Malicious requests can be blocked before they even reach the region.&lt;/p&gt;
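
&lt;p&gt;For the global edge network scenario, the Web ACL must be created with CLOUDFRONT scope in the us-east-1 region. A minimal Terraform sketch (names are placeholders) could look like this, with the ACL then attached to a distribution via its &lt;code&gt;web_acl_id&lt;/code&gt; argument:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;provider "aws" {
  alias  = "us_east_1"
  region = "us-east-1" # WAF for Cloudfront must live in us-east-1
}

resource "aws_wafv2_web_acl" "edge" {
  provider = aws.us_east_1
  name     = "edge-web-acl" # placeholder name
  scope    = "CLOUDFRONT"

  default_action {
    allow {}
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "edge-web-acl"
    sampled_requests_enabled   = true
  }
}

# In the distribution: web_acl_id = aws_wafv2_web_acl.edge.arn
# (WAFv2 ACLs are referenced by ARN, not by ID.)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;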

&lt;h3&gt;
  
  
  Introduction to how AWS WAF works
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc0ih0f3yrmbgdquripib.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc0ih0f3yrmbgdquripib.png" width="800" height="604"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Define a Web Access Control List (Web ACL) configured to protect a set of AWS resources (such as Amazon Cloudfront or AWS Application Load Balancer).&lt;/li&gt;
&lt;li&gt;Specify your desired actions as rule statements. These can be custom and specified by you, managed by the AWS Threat Research Team or a 3rd party vendor from the AWS Marketplace.

&lt;ul&gt;
&lt;li&gt;Each rule consists of a condition and an action. Example: if request origin country is this value, then BLOCK the request.&lt;/li&gt;
&lt;li&gt;A rule can be rate based, IP allow/deny, geoblocking at country level and so on.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Organize re-usable rules in Rule Groups that can be attached to multiple WebACLs.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Rules are evaluated starting from the lowest numeric priority (1) upwards, until a matching rule terminates the evaluation, or all rules have been evaluated without a match.&lt;/p&gt;

&lt;p&gt;To account for the complexity and evaluation cost of the total combination of rules and rule groups, each Web ACL has a concept called &lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/aws-waf-capacity-units.html" rel="noopener noreferrer"&gt;web ACL capacity units (WCU)&lt;/a&gt;, limited to a maximum of 5,000 WCUs per Web ACL or Rule Group.&lt;/p&gt;

&lt;p&gt;If your company has subscribed to AWS Shield Advanced, the service will add an additional Rule Group managed by the AWS Shield Response Team for tailored mitigation.&lt;/p&gt;

&lt;p&gt;Here is an overview of some of the Managed Rule Groups available for configuration in the AWS Console:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AWS Managed rule groups&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhs8sgbr4jh2aqk7sig6d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhs8sgbr4jh2aqk7sig6d.png" width="705" height="1024"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1xlev1ewmn9nhnkiwd6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm1xlev1ewmn9nhnkiwd6.png" width="800" height="847"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  AWS WAF Traffic dashboard insight
&lt;/h4&gt;

&lt;p&gt;Having visibility into incoming traffic is paramount for successful mitigation and for ensuring valid traffic is not impacted by mistake. AWS WAF WebACL logs can be shipped to an Amazon CloudWatch Logs log group or an Amazon S3 bucket, which enables easy querying with CloudWatch Logs Insights and/or Amazon Athena. In addition, an Amazon Data Firehose delivery stream can be set up as a destination for further processing or shipping to a 3rd party log analysis solution such as Splunk.&lt;/p&gt;
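
&lt;p&gt;As a sketch, shipping WebACL logs to CloudWatch Logs can be provisioned with Terraform roughly as follows, assuming a Web ACL resource named &lt;code&gt;example&lt;/code&gt; already exists; note that the destination log group name must start with the &lt;code&gt;aws-waf-logs-&lt;/code&gt; prefix:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Destination log group name must start with "aws-waf-logs-".
resource "aws_cloudwatch_log_group" "waf" {
  name              = "aws-waf-logs-example" # placeholder name
  retention_in_days = 30
}

resource "aws_wafv2_web_acl_logging_configuration" "example" {
  resource_arn            = aws_wafv2_web_acl.example.arn
  log_destination_configs = [aws_cloudwatch_log_group.waf.arn]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;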

&lt;p&gt;AWS Web Application Firewall is of course set up for my blog, so I have the opportunity to share some recent insights from the last seven days as of publication of this post.&lt;/p&gt;

&lt;p&gt;The graph below illustrates the distribution between Allowed and Blocked requests. At this point in time I have no Count actions configured.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F596b6x495igad5pr1bbg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F596b6x495igad5pr1bbg.png" width="800" height="463"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The graph below illustrates the types of attacks identified in the requests. The majority are of type NoUserAgent, and there are just a few BadBots.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl0iuy3jzy9d2g2wwbgwm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl0iuy3jzy9d2g2wwbgwm.png" width="800" height="498"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The graph below illustrates the ten most common rule labels added to incoming requests.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fytv4cv9fhsuebvqwumn4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fytv4cv9fhsuebvqwumn4.png" width="800" height="484"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In addition to the useful graphs on the Traffic overview tab, you can also easily query the WAF logs directly from the CloudWatch Log Insights tab.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyvoixve0r0c4n8fw7ko7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyvoixve0r0c4n8fw7ko7.png" width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here is a sample of some of the information available in the request logs, which you can base your rule logic on, with some details redacted or modified. For a full description of all available log fields, see the &lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/logging-fields.html" rel="noopener noreferrer"&gt;AWS WAF Developer Guide – Log fields&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We can see that the action is ALLOW and the terminating rule ID is Default_Action, so the request was permitted and passed on to the backend.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;action: ALLOW
httpRequest.clientIp: 123.45.67.809
httpRequest.country: NO
httpRequest.headers.10.value: https://hedrange.com/
httpRequest.headers.4.value: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36 Edg/125.0.0.0
httpRequest.uri: /wp-content/uploads/2023/10/2023-10-oidc-feat.png
ruleGroupList.0.ruleGroupId: AWS#AWSManagedRulesAmazonIpReputationList
ruleGroupList.1.ruleGroupId: AWS#AWSManagedRulesWordPressRuleSet
ruleGroupList.2.ruleGroupId: AWS#AWSManagedRulesCommonRuleSet
terminatingRuleId: Default_Action
terminatingRuleType: REGULAR
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CloudWatch Logs Insights query below searches the selected log groups containing the WAF logs for the country, action, URI and terminating rule ID of BLOCKed requests during the last week, limited to 50 results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  fields httpRequest.country, action, httpRequest.uri, terminatingRuleId
| filter action = "BLOCK"
| sort @timestamp desc
| limit 50
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;httpRequest.country&lt;/th&gt;
&lt;th&gt;action&lt;/th&gt;
&lt;th&gt;httpRequest.uri&lt;/th&gt;
&lt;th&gt;terminatingRuleId&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;UA&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/archivarix.cms.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesAmazonIpReputationList&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FR&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesAmazonIpReputationList&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UA&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/wp-content/themes/sketch/404.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PH&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/xmlrpc.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesWordPressRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PH&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/xmlrpc.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesWordPressRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/xmlrpc.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesWordPressRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/xmlrpc.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesWordPressRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/xmlrpc.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesWordPressRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/xmlrpc.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesWordPressRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CA&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/robots.txt&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CN&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/xmlrpc.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesWordPressRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/wp-content/plugins/index.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/css/sgd.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/revision.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/.well-known/admin.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/wp-content/install.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/wp-includes/Requests/dropdown.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/wp-includes/pomo/fgertreyersd.php.suspected&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/wp-includes/sts.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/google.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/wp-content/uploads/error_log.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/db.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/wp-includes/pomo/wp-login.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/wp-admin/js/privacy-tools.min.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/autoload_classmap.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/link.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/ws.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/doc.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/wp-admin/js/widgets/cong.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/wp-includes/rest-api/endpoints/html.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/wp-content/uploads/wp-login.php.suspected&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/01.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/wp-content/uploads/cong.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/.well-known//.well-known/owlmailer.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/wp-includes/js/tinymce/skins/wordpress/images/index.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/worm0.PhP7&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/user.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/edit.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/wp-includes/js/tinymce/skins/lightgray/img/index.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE&lt;/td&gt;
&lt;td&gt;BLOCK&lt;/td&gt;
&lt;td&gt;/options.php&lt;/td&gt;
&lt;td&gt;AWS-AWSManagedRulesCommonRuleSet&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As we can see, there are many good examples here of potentially malicious URIs which are blocked before reaching the application. Even though these resources do not exist on the web server in question, this is simply random HTTP request guessing from malicious actors.&lt;/p&gt;
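&lt;p&gt;The table above is derived from AWS WAF logs. As a minimal sketch of how those four columns can be extracted from a single log record (the field names follow the AWS WAF log schema; the sample record itself is fabricated for illustration):&lt;/p&gt;

```python
import json

# A trimmed, fabricated sample of an AWS WAF log record.
# Real records contain many more fields.
sample = json.dumps({
    "action": "BLOCK",
    "terminatingRuleId": "AWS-AWSManagedRulesWordPressRuleSet",
    "httpRequest": {
        "country": "US",
        "uri": "/xmlrpc.php",
    },
})

def summarize(record_json):
    """Extract the four columns shown in the table from one WAF log record."""
    record = json.loads(record_json)
    http = record.get("httpRequest", {})
    return {
        "country": http.get("country"),
        "action": record.get("action"),
        "uri": http.get("uri"),
        "terminatingRuleId": record.get("terminatingRuleId"),
    }

print(summarize(sample))
```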

&lt;h2&gt;
  
  
  AWS WAF provisioning with Terraform
&lt;/h2&gt;

&lt;p&gt;For demonstration purposes, reference is made to my blog post &lt;a href="https://hedrange.com/2024/05/03/develop-lightweight-and-secure-rest-apis-with-aws-lambda-function-url-and-terraform/" rel="noopener noreferrer"&gt;Develop lightweight and secure REST APIs with AWS Lambda Function URL and Terraform&lt;/a&gt;, which includes Terraform resource configurations for AWS WAF in front of Amazon Cloudfront.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;As a first step we define a WAFv2 WebACL with a default action of ALLOW.&lt;/li&gt;
&lt;li&gt;The first rule is the AWS Managed Rule &lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/aws-managed-rule-groups-ip-rep.html" rel="noopener noreferrer"&gt;AWSManagedRulesAmazonIpReputationList&lt;/a&gt; (WCU: 25) with priority 1.

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;The Amazon IP reputation list rule group contains rules that are based on Amazon internal threat intelligence. This is useful if you would like to block IP addresses typically associated with bots or other threats. Blocking these IP addresses can help mitigate bots and reduce the risk of a malicious actor discovering a vulnerable application (&lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/aws-managed-rule-groups-ip-rep.html" rel="noopener noreferrer"&gt;reference&lt;/a&gt;).&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;The second rule is the AWS Managed Rule &lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/aws-managed-rule-groups-use-case.html#aws-managed-rule-groups-use-case-wordpress-app" rel="noopener noreferrer"&gt;AWSManagedRulesWordPressRuleSet&lt;/a&gt; (WCU: 100) with priority 2.

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;The WordPress application rule group contains rules that block request patterns associated with the exploitation of vulnerabilities specific to WordPress sites. You should evaluate this rule group if you are running WordPress. This rule group should be used in conjunction with the &lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/aws-managed-rule-groups-use-case.html#aws-managed-rule-groups-use-case-sql-db" rel="noopener noreferrer"&gt;SQL database&lt;/a&gt; and &lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/aws-managed-rule-groups-use-case.html#aws-managed-rule-groups-use-case-php-app" rel="noopener noreferrer"&gt;PHP application&lt;/a&gt; rule groups (&lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/aws-managed-rule-groups-use-case.html#aws-managed-rule-groups-use-case-wordpress-app" rel="noopener noreferrer"&gt;reference)&lt;/a&gt;.&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;The third rule is the AWS Managed Rule &lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/aws-managed-rule-groups-baseline.html#aws-managed-rule-groups-baseline-known-bad-inputs" rel="noopener noreferrer"&gt;AWSManagedRulesKnownBadInputsRuleSet&lt;/a&gt; (WCU: 200) with priority 3.

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;The Known bad inputs rule group contains rules to block request patterns that are known to be invalid and are associated with exploitation or discovery of vulnerabilities. This can help reduce the risk of a malicious actor discovering a vulnerable application (&lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/aws-managed-rule-groups-baseline.html#aws-managed-rule-groups-baseline-known-bad-inputs" rel="noopener noreferrer"&gt;reference&lt;/a&gt;).&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;The fourth rule is the AWS Managed Rule “&lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/aws-managed-rule-groups-baseline.html#aws-managed-rule-groups-baseline-crs" rel="noopener noreferrer"&gt;AWSManagedRulesCommonRuleSet&lt;/a&gt;” (WCU: 700) with priority 4.

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;The core rule set (CRS) rule group contains rules that are generally applicable to web applications. This provides protection against exploitation of a wide range of vulnerabilities, including some of the high risk and commonly occurring vulnerabilities described in OWASP publications such as &lt;a href="https://owasp.org/www-project-top-ten/" rel="noopener noreferrer"&gt;OWASP Top 10&lt;/a&gt;. Consider using this rule group for any AWS WAF use case (&lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/aws-managed-rule-groups-baseline.html#aws-managed-rule-groups-baseline-crs" rel="noopener noreferrer"&gt;reference&lt;/a&gt;).&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Then we define a Cloudfront distribution.&lt;/li&gt;
&lt;li&gt;Configure AWS WAF WebACL for Cloudfront.&lt;/li&gt;
&lt;/ol&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Step #1 - Create a Web ACL
resource "aws_wafv2_web_acl" "lambda_function_url_demo" {
  #checkov:skip=CKV2_AWS_31: WAF2 logging configuration not necessary for this use-case.
  count = var.provision_cloudfront == true ? 1 : 0
  provider = aws.us-east-1
  name = "lambda_function_url_demo"
  description = "Web ACL with managed rule groups for lambda_function_url_demo"
  scope = "CLOUDFRONT"

  default_action {
    allow {}
  }
# Step 2 - First rule
  rule {
    name = "AWSManagedRulesAmazonIpReputationList"
    priority = 1

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name = "AWSManagedRulesAmazonIpReputationList"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = false
      metric_name = "AWSManagedRulesAmazonIpReputationList"
      sampled_requests_enabled = false
    }
  }
# Step 3 - Second rule
  rule {
    name = "AWSManagedRulesWordPressRuleSet"
    priority = 2

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name = "AWSManagedRulesWordPressRuleSet"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = false
      metric_name = "AWSManagedRulesWordPressRuleSet"
      sampled_requests_enabled = false
    }
  }
# Step 4 - Third rule
  rule {
    name = "AWSManagedRulesKnownBadInputsRuleSet"
    priority = 3

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name = "AWSManagedRulesKnownBadInputsRuleSet"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = false
      metric_name = "AWSManagedRulesKnownBadInputsRuleSet"
      sampled_requests_enabled = false
    }
  }
# Step 5 - Fourth rule  
  rule {
    name = "AWSManagedRulesCommonRuleSet"
    priority = 4

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name = "AWSManagedRulesCommonRuleSet"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = false
      metric_name = "AWSManagedRulesCommonRuleSet"
      sampled_requests_enabled = false
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name = "web-acl-lambda-function-url-demo"
    sampled_requests_enabled = true
  }
}
# Step 6 - Define Cloudfront distribution resource
resource "aws_cloudfront_distribution" "lambda_function_url_demo" {
  #checkov:skip=CKV_AWS_310: Origin failover is not required for this use-case.
  #checkov:skip=CKV2_AWS_42: Custom SSL certificate is not required for this use-case.
  #checkov:skip=CKV2_AWS_32: Response headers policy not required.
  #checkov:skip=CKV_AWS_68: WAF to come
  #checkov:skip=CKV_AWS_111: WAF to come
  #checkov:skip=CKV2_AWS_47: WAF to come
  count = var.provision_cloudfront == true ? 1 : 0
  provider = aws.us-east-1
  origin {
    domain_name = local.lambda_function_url_demo_domain_name
    origin_access_control_id = aws_cloudfront_origin_access_control.cloudfront_oac_lambda_url[0].id
    origin_id = local.lambda_function_origin_id

    custom_origin_config {
      http_port = 80
      https_port = 443
      origin_protocol_policy = "https-only"
      origin_ssl_protocols = ["TLSv1.2"]
      origin_keepalive_timeout = 5
      origin_read_timeout = 30
    }
  }

  enabled = true
  is_ipv6_enabled = true
  default_root_object = "index.html"
  price_class = "PriceClass_200"

  logging_config {
    include_cookies = false
    bucket = module.cloudfront_logs[0].s3_bucket_bucket_domain_name
    prefix = "lambda_function_url_demo"
  }

  default_cache_behavior {
    allowed_methods = ["HEAD", "DELETE", "POST", "GET", "OPTIONS", "PUT", "PATCH"]
    cached_methods = ["GET", "HEAD", "OPTIONS"]
    target_origin_id = local.lambda_function_origin_id

    forwarded_values {
      query_string = true

      cookies {
        forward = "none"
      }
    }

    viewer_protocol_policy = "redirect-to-https"
    min_ttl = 0
    default_ttl = 0
    max_ttl = 86400
    compress = true
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
      locations = []
    }
  }

  tags = {
    Name = "LambdaFunctionUrlDemo"
  }

  viewer_certificate {
    cloudfront_default_certificate = true
    minimum_protocol_version = "TLSv1.2_2018"
  }
# Step 7 - Configure AWS WAF WebACL for Cloudfront
  web_acl_id = aws_wafv2_web_acl.lambda_function_url_demo[0].arn
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Full sample code including logging configuration is available at &lt;a href="https://github.com/haakond/terraform-aws-lambda-function-url/blob/main/waf.tf" rel="noopener noreferrer"&gt;https://github.com/haakond/terraform-aws-lambda-function-url/blob/main/waf.tf&lt;/a&gt; and &lt;a href="https://github.com/haakond/terraform-aws-lambda-function-url/blob/main/cloudfront.tf" rel="noopener noreferrer"&gt;https://github.com/haakond/terraform-aws-lambda-function-url/blob/main/cloudfront.tf&lt;/a&gt;. Check out the &lt;a href="https://github.com/haakond/terraform-aws-lambda-function-url/blob/main/README.md" rel="noopener noreferrer"&gt;README.md&lt;/a&gt; for how to deploy.&lt;/p&gt;

&lt;h2&gt;
  
  
  AWS WAF pricing
&lt;/h2&gt;

&lt;p&gt;AWS charges for 1) each Web ACL, 2) the number of rules configured and 3) the number of requests processed. The more WCUs your configuration consumes, the higher the cost.&lt;/p&gt;

&lt;p&gt;More advanced capabilities such as Bot Control and Fraud Control have additional subscription and processing costs.&lt;/p&gt;

&lt;p&gt;You can also subscribe to Managed Rules from third-party providers on the AWS Marketplace, which are billed separately.&lt;/p&gt;

&lt;p&gt;For full insight and scenario examples study &lt;a href="https://aws.amazon.com/waf/pricing/" rel="noopener noreferrer"&gt;https://aws.amazon.com/waf/pricing/&lt;/a&gt;.&lt;/p&gt;
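&lt;p&gt;To make the pricing dimensions concrete, here is a rough back-of-the-envelope estimate. The unit prices below are illustrative assumptions, not authoritative figures; always check the pricing page linked above for current numbers.&lt;/p&gt;

```python
# Rough monthly AWS WAF cost estimate across the three pricing dimensions.
# The unit prices are ASSUMPTIONS for illustration only - verify against
# the AWS WAF pricing page before relying on them.
PRICE_PER_WEB_ACL = 5.00      # USD per Web ACL per month (assumed)
PRICE_PER_RULE = 1.00         # USD per rule per month (assumed)
PRICE_PER_MILLION_REQ = 0.60  # USD per million requests (assumed)

def estimate_monthly_cost(web_acls, rules, requests_millions):
    """Sum the three per-month cost components."""
    return (web_acls * PRICE_PER_WEB_ACL
            + rules * PRICE_PER_RULE
            + requests_millions * PRICE_PER_MILLION_REQ)

# One Web ACL with the four managed rule groups, 10 million requests/month:
print(round(estimate_monthly_cost(1, 4, 10), 2))
```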

&lt;h2&gt;
  
  
  Conclusion and recommendations
&lt;/h2&gt;

&lt;p&gt;In this blog post we explored Web Application Firewall as a concept and considered different implementation options. We reviewed AWS WAF as a managed service and explored relevant rules, traffic analysis and logging.&lt;/p&gt;

&lt;p&gt;To get up and running with WAF, I recommend starting simple and choosing only the rules relevant to your type of workload: application, operating system and compute option.&lt;/p&gt;

&lt;p&gt;Align with your security department on the applicable policies and protection mechanisms to adhere to. The more complexity you add to WAF, the more intensive the traffic analysis becomes, which in turn increases costs.&lt;/p&gt;

&lt;p&gt;To reduce rule evaluation (and cost), place the widest and most probable rules (lowest WCU) first and the most narrow or heavy ones (highest WCU) last. The base price for a Web ACL includes up to 1500 WCUs, so try to stay below that to avoid extra charges.&lt;/p&gt;
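&lt;p&gt;A quick sanity check of the WCU budget for the four managed rule groups used in this post, based on the WCU values listed earlier:&lt;/p&gt;

```python
# WCU budget check for the managed rule groups used in this post.
# The WCU values come from the rule descriptions above; 1500 WCUs is the
# capacity included in a Web ACL's base price.
rule_group_wcus = {
    "AWSManagedRulesAmazonIpReputationList": 25,
    "AWSManagedRulesWordPressRuleSet": 100,
    "AWSManagedRulesKnownBadInputsRuleSet": 200,
    "AWSManagedRulesCommonRuleSet": 700,
}

total = sum(rule_group_wcus.values())
print(total)         # 1025
print(total > 1500)  # False: well within the included 1500 WCUs
```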

&lt;p&gt;When adding new rules, choose action type COUNT first and observe the WAF logs for a reasonable period of time to ensure valid traffic is not impacted, before switching to BLOCK.&lt;/p&gt;
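&lt;p&gt;In Terraform this maps to the rule's &lt;code&gt;override_action&lt;/code&gt;: setting &lt;code&gt;count&lt;/code&gt; instead of &lt;code&gt;none&lt;/code&gt; makes the managed rule group only count matches. A minimal sketch against the Web ACL above (the rule shown is illustrative):&lt;/p&gt;

```hcl
  rule {
    name = "AWSManagedRulesCommonRuleSet"
    priority = 4

    # COUNT mode: matches are recorded in logs and metrics but not blocked.
    # Switch back to none {} (use the rule group's own actions) once the
    # logs confirm legitimate traffic is unaffected.
    override_action {
      count {}
    }

    statement {
      managed_rule_group_statement {
        name = "AWSManagedRulesCommonRuleSet"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name = "AWSManagedRulesCommonRuleSet"
      sampled_requests_enabled = true
    }
  }
```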

&lt;h2&gt;
  
  
  References and additional resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/security-pillar/infrastructure-protection.html" rel="noopener noreferrer"&gt;AWS Well-Architected Framework – Security Pillar whitepaper – Infrastructure protection&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/wafv2_web_acl" rel="noopener noreferrer"&gt;Terraform AWS provider – wafv2_web_acl&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/what-is-aws-waf.html" rel="noopener noreferrer"&gt;AWS WAF, AWS Firewall Manager, and AWS Shield Advanced – Developer Guide&lt;/a&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/aws-managed-rule-groups-list.html" rel="noopener noreferrer"&gt;AWS Managed Rules rule groups list&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/logging.html" rel="noopener noreferrer"&gt;Logging AWS WAF web ACL traffic&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/web-acl-testing.html" rel="noopener noreferrer"&gt;Testing and tuning your AWS WAF protections&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://www.youtube.com/watch?v=PCqx7MQsmwY" rel="noopener noreferrer"&gt;AWS re:Inforce 2023 – Building a secure perimeter with AWS WAF (NIS224)&lt;/a&gt; (15 minutes)&lt;/li&gt;

&lt;li&gt;

&lt;a href="https://www.youtube.com/watch?v=KpAao6ox-cM" rel="noopener noreferrer"&gt;AWS re:Invent 2023 – Safeguarding infrastructure from DDoS attacks with AWS edge services (NET201)&lt;/a&gt; (53 minutes)&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;The post &lt;a href="https://hedrange.com/2024/05/23/protect-your-webapps-from-malicious-traffic-with-aws-web-application-firewall/" rel="noopener noreferrer"&gt;Protect your webapps from malicious traffic with AWS Web Application Firewall&lt;/a&gt; first appeared on &lt;a href="https://hedrange.com" rel="noopener noreferrer"&gt;Håkon Eriksen Drange&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>articles</category>
      <category>aws</category>
      <category>ddos</category>
      <category>security</category>
    </item>
    <item>
      <title>Develop lightweight and secure REST APIs with AWS Lambda Function URL and Terraform</title>
      <dc:creator>Håkon Eriksen Drange</dc:creator>
      <pubDate>Fri, 03 May 2024 12:55:52 +0000</pubDate>
      <link>https://forem.com/haakoned/develop-lightweight-and-secure-rest-apis-with-aws-lambda-function-url-and-terraform-3i0o</link>
      <guid>https://forem.com/haakoned/develop-lightweight-and-secure-rest-apis-with-aws-lambda-function-url-and-terraform-3i0o</guid>
      <description>&lt;p&gt;Table of contents&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;
What an AWS Lambda Function URL is and how it differs from a regular AWS Lambda Function

&lt;ul&gt;
&lt;li&gt;Caveats&lt;/li&gt;
&lt;li&gt;Verification of Origin Access Control&lt;/li&gt;
&lt;li&gt;Additional protection with AWS Web Application Firewall&lt;/li&gt;
&lt;li&gt;Full code example&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Conclusion&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;When it comes to developing REST APIs on AWS, there are many options. A traditional approach is to handle the application layer with in-house business logic, based on company tech stack preferences for programming languages and frameworks, deployed on compute resources like EC2 or ECS/EKS behind Application Load Balancers.&lt;/p&gt;

&lt;p&gt;Another approach in a more distributed world is to offload the routing logic to a managed service such as Amazon API Gateway. This can remove a lot of heavy lifting and logic in your application layer so that developers can focus more on core business logic and modularization.&lt;/p&gt;

&lt;p&gt;But sometimes developers only need to expose very simple functionality through an HTTPS endpoint. API Gateway might seem too complex, and perhaps more advanced functionality like authentication, routing and throttling is not necessary to get something up and running quickly. For these situations, AWS Lambda Function URL could be a feature worth exploring.&lt;/p&gt;

&lt;h2&gt;
  
  
  What an AWS Lambda Function URL is and how it differs from a regular AWS Lambda Function
&lt;/h2&gt;

&lt;p&gt;Lambda functions can be invoked from a number of AWS services such as DynamoDB Streams, SQS, Kinesis and so on.&lt;/p&gt;

&lt;p&gt;Current &lt;em&gt;direct&lt;/em&gt; &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-invocation.html" rel="noopener noreferrer"&gt;Lambda invocation methods&lt;/a&gt; are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The Lambda console&lt;/li&gt;
&lt;li&gt;The AWS SDK&lt;/li&gt;
&lt;li&gt;The Invoke API&lt;/li&gt;
&lt;li&gt;AWS CLI&lt;/li&gt;
&lt;li&gt;Function URL HTTPS endpoint&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The method Function URL enables a Lambda function to be invoked by an HTTPS endpoint in the format of &lt;code&gt;https://&amp;lt;url-id&amp;gt;.lambda-url.&amp;lt;region&amp;gt;.on.aws&lt;/code&gt;, in addition to the traditional invocation methods.&lt;/p&gt;
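&lt;p&gt;The URL format above can be illustrated with a small sketch that pulls the url-id and region out of a Function URL (the regular expression is an assumption based on the format shown, not an official parser):&lt;/p&gt;

```python
import re

# Matches the Function URL format: https://{url-id}.lambda-url.{region}.on.aws
# Group 1 captures the url-id, group 2 the region.
FUNCTION_URL_RE = re.compile(
    r"https://([a-z0-9]+)\.lambda-url\.([a-z0-9-]+)\.on\.aws/?"
)

def parse_function_url(url):
    """Return (url_id, region) from a Lambda Function URL."""
    match = FUNCTION_URL_RE.fullmatch(url)
    if match is None:
        raise ValueError("not a Lambda Function URL: " + url)
    return match.group(1), match.group(2)

print(parse_function_url(
    "https://ytnvqv4vyv5jdhaj4xumgtgd4e0ggowg.lambda-url.eu-west-1.on.aws/"))
```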

&lt;p&gt;Access can be controlled with the AuthType parameter combined with resource-based policies.&lt;/p&gt;

&lt;p&gt;To only provide access to authenticated users and roles, developers can configure &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/urls-auth.html#urls-auth-iam" rel="noopener noreferrer"&gt;AuthType AWS_IAM&lt;/a&gt;. Each HTTP request is &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/urls-invocation.html#urls-invocation-basics" rel="noopener noreferrer"&gt;signed&lt;/a&gt; using &lt;a href="https://docs.aws.amazon.com/general/latest/gr/signature-version-4.html" rel="noopener noreferrer"&gt;AWS Signature Version 4 (SigV4)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For unauthenticated access to anyone, specify &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/urls-auth.html#urls-auth-none" rel="noopener noreferrer"&gt;&lt;code&gt;AuthType NONE&lt;/code&gt;&lt;/a&gt;. Do take into consideration that the Lambda Function URL itself does not provide throttling or protection capabilities (described in more detail below).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjk3bibt4kosyywuom9ce.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjk3bibt4kosyywuom9ce.png" width="800" height="319"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now follows some Terraform code for demonstration purposes. To keep it as simple as possible, we deploy an AWS Lambda function based on &lt;a href="https://github.com/terraform-aws-modules/terraform-aws-lambda" rel="noopener noreferrer"&gt;terraform-aws-modules/terraform-aws-lambda&lt;/a&gt;. &lt;code&gt;create_lambda_function_url&lt;/code&gt; is set to &lt;code&gt;true&lt;/code&gt; and &lt;code&gt;authorization_type&lt;/code&gt; to &lt;code&gt;NONE&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The Python source code is intentionally left out at this stage; a fully working code example is referenced in the conclusion.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# AWS Lambda Function with endpoint URL
module "lambda_function_url_demo" {
  source = "git::https://github.com/terraform-aws-modules/terraform-aws-lambda.git?ref=f7866811bc1429ce224bf6a35448cb44aa5155e7"

  function_name = "lambda-function-url-demo"
  description = "Lambda Function URL Demo"
  handler = "index.lambda_handler"
  runtime = "python3.12"
  source_path = "./src/lambda-function-url-demo/index.py"
  create_lambda_function_url = true
  authorization_type = "NONE"
  timeout = 30
  cors = {
    allow_credentials = true
    allow_origins = ["*"]
    allow_methods = ["*"]
    allow_headers = ["date", "keep-alive"]
    expose_headers = ["keep-alive", "date"]
    max_age = 60
  }

  tags = {
    Name = "LambdaFunctionUrlDemo"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;output "lambda_function_url_demo_arn" {
  value = module.lambda_function_url_demo.lambda_function_arn
  description = "Lambda Function URL Demo ARN"
  sensitive = false
}

output "lambda_function_url_demo_url" {
  value = module.lambda_function_url_demo.lambda_function_url
  description = "Lambda Function URL Demo URL"
  sensitive = false
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;lambda_function_url_demo_arn = “arn:aws:lambda:eu-west-1:1234567890:function:lambda-function-url-demo”&lt;/p&gt;

&lt;p&gt;lambda_function_url_demo_url = “&lt;a href="https://ytnvqv4vyv5jdhaj4xumgtgd4e0ggowg.lambda-url.eu-west-1.on.aws/" rel="noopener noreferrer"&gt;https://ytnvqv4vyv5jdhaj4xumgtgd4e0ggowg.lambda-url.eu-west-1.on.aws/&lt;/a&gt;“&lt;/p&gt;

&lt;p&gt;The AWS Lambda Function can now be accessed by the endpoint URL output from &lt;code&gt;terraform apply&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Caveats
&lt;/h3&gt;

&lt;p&gt;Lambda Function URLs are designed to be a simple building block and by themselves do not support throttling, API token authentication and management, Web Application Firewall (WAF) or DDoS protection.&lt;/p&gt;

&lt;p&gt;However, this is where Amazon Cloudfront and friends come to assist.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1k1y5lepav4m8v63whi5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1k1y5lepav4m8v63whi5.png" width="800" height="518"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With this approach the perimeter is moved from regional to global endpoints. This means we can benefit from the global scale and network acceleration of Cloudfront which includes AWS Shield Standard for L3/L4 DDoS Protection (you can subscribe to Shield Advanced for L7 protection, automated mitigation and Shield Response Team support). In combination with Web Application Firewall malicious traffic can be mitigated and dropped at the edge to protect our origin from unwanted invocations. This is not only beneficial from a security point of view but it also keeps costs under control.&lt;/p&gt;

&lt;p&gt;On April 11th 2024, AWS announced support for Origin Access Control for Lambda Function URL origins. The Terraform AWS Provider added support for this in &lt;a href="https://github.com/hashicorp/terraform-provider-aws/releases/tag/v5.46.0" rel="noopener noreferrer"&gt;v5.46.0&lt;/a&gt;, which was released April 19th 2024.&lt;/p&gt;

&lt;p&gt;Changelog: resource/aws_cloudfront_origin_access_control: Add &lt;code&gt;lambda&lt;/code&gt; and &lt;code&gt;mediapackagev2&lt;/code&gt; as valid values for &lt;code&gt;origin_access_control_origin_type&lt;/code&gt; (&lt;a href="https://github.com/hashicorp/terraform-provider-aws/issues/34362" rel="noopener noreferrer"&gt;#34362&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;Our Lambda function will now look like this. On line 11, &lt;code&gt;authorization_type&lt;/code&gt; is changed from “NONE” to “AWS_IAM”. From line 27, a Lambda permission resource is added which grants the Cloudfront distribution permission to invoke the function. Line 35 and onwards defines the basic properties of an AWS Cloudfront distribution.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# AWS Lambda Function
module "lambda_function_url_demo" {
  source = "git::https://github.com/terraform-aws-modules/terraform-aws-lambda.git?ref=f7866811bc1429ce224bf6a35448cb44aa5155e7"

  function_name = "lambda-function-url-demo"
  description = "Lambda Function URL Demo"
  handler = "index.lambda_handler"
  runtime = "python3.12"
  source_path = "./src/lambda-function-url-demo/index.py"
  create_lambda_function_url = true
  authorization_type = "AWS_IAM"
  timeout = 30
  cors = {
    allow_credentials = true
    allow_origins = ["*"]
    allow_methods = ["*"]
    allow_headers = ["date", "keep-alive"]
    expose_headers = ["keep-alive", "date"]
    max_age = 60
  }

  tags = {
    Name = "LambdaFunctionUrlDemo"
  }
}

resource "aws_lambda_permission" "allow_cloudfront" {
  statement_id = "AllowCloudFrontServicePrincipal"
  action = "lambda:InvokeFunctionUrl"
  function_name = module.lambda_function_url_demo.lambda_function_name
  principal = "cloudfront.amazonaws.com"
  source_arn = aws_cloudfront_distribution.lambda_function_url_demo[0].arn
}

resource "aws_cloudfront_distribution" "lambda_function_url_demo" {
  provider = aws.us-east-1
  origin {
    domain_name = local.lambda_function_url_demo_domain_name
    origin_access_control_id = aws_cloudfront_origin_access_control.cloudfront_oac_lambda_url[0].id
    origin_id = local.lambda_function_origin_id

    custom_origin_config {
      http_port = 80
      https_port = 443
      origin_protocol_policy = "https-only"
      origin_ssl_protocols = ["TLSv1.2"]
      origin_keepalive_timeout = 5
      origin_read_timeout = 30
    }
  }

  enabled = true
  is_ipv6_enabled = true
  default_root_object = "index.html"
  price_class = "PriceClass_200"

  logging_config {
    include_cookies = false
    bucket = module.cloudfront_logs[0].s3_bucket_bucket_domain_name
    prefix = "lambda_function_url_demo"
  }

  default_cache_behavior {
    allowed_methods = ["HEAD", "DELETE", "POST", "GET", "OPTIONS", "PUT", "PATCH"]
    cached_methods = ["GET", "HEAD", "OPTIONS"]
    target_origin_id = local.lambda_function_origin_id

    forwarded_values {
      query_string = true

      cookies {
        forward = "none"
      }
    }

    viewer_protocol_policy = "redirect-to-https"
    min_ttl = 0
    default_ttl = 0
    max_ttl = 86400
    compress = true
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
      locations = []
    }
  }

  tags = {
    Name = "LambdaFunctionUrlDemo"
  }

  viewer_certificate {
    cloudfront_default_certificate = true
    minimum_protocol_version = "TLSv1.2_2018"
  }
  web_acl_id = aws_wafv2_web_acl.lambda_function_url_demo[0].arn
}

# Amazon Cloudfront distribution OAC
resource "aws_cloudfront_origin_access_control" "cloudfront_oac_lambda_url" {
  name = "cloudfront_oac_lambda_url"
  description = "Policy for Lambda Function URL origins"
  origin_access_control_origin_type = "lambda"
  signing_behavior = "always"
  signing_protocol = "sigv4"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Verification of Origin Access Control
&lt;/h4&gt;

&lt;p&gt;For verification we can observe incoming requests in CloudWatch Logs. The signed request below, forwarded through CloudFront, carries the &lt;code&gt;x-amz-content-sha256&lt;/code&gt; and &lt;code&gt;x-amz-security-token&lt;/code&gt; headers; for unauthenticated requests sent directly to the Lambda Function URL, these headers are absent.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[INFO]  2024-04-26T13:35:48.597Z    4f4c2779-05cb-439b-ba55-b50415679622    {
    "version": "2.0",
    "routeKey": "$default",
    "rawPath": "/index.html",
    "rawQueryString": "input1=YES",
    "headers": {
        "x-amz-content-sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
        "x-amzn-tls-version": "TLSv1.2",
        "sec-fetch-site": "same-origin",
        "x-amz-source-account": "602472554111",
        "x-forwarded-port": "443",
        "sec-fetch-user": "?1",
        "x-amz-security-token": "IQoJb3JpZ2luX2VjEHYaCXVzLWVhc3QtMSJHMEUCIA2gGi8zMAO+5h2fGL65nksRPq51Ks3y3RL5o/SfleOiAiEA6om0TmJJe3uFRO54DCL9DV6/eGxcr+KGEzNiWblc/r0qlwIIv///////////ARAAGgw4NTYzNjkwNTMxODEiDM9OfrD8gawO+6OZ0irrAegKU5LkXCbnheQAeURiaTIUv9BTjdYeGa7p1gjpPw0N51w/2BB/c0eUae11ONdgdcyK6hCLthfyOay96rx7YTbXhvtVSTWkk7Nz6eAAyttffUv+n5c5+CY2M0bLntNkuisImgKh0RRl1rsTYILXOTqqVlT5+Ipd/yZNPtgXf0NsPJNEOsAtWvSZf4PScxYd9Xlk0CvNDUCk1BZ5afUkLXlhO/T1F2Tu0oaYbIwFxLZngmDc+KEMo82HkocD4VG/fmUtp7x2ln9BwCINkLtg7P4REgsJ9WdNUG647hrkJMcBRHaYePATqPhdYtIwttqusQY6jwFudLv6XtcxTs+Yi8NuweNYVXOvyR9N28zX6OasvJh4p3JseSxXr1Ejsgnhcb9rc40uhHlvwqvuNFdgeXiB+xEiVkDV2KtOEULzVd+bO1Nf4va6WTuob1wPG4W73TAUO+xLaDedVcpp+kQQYmr3I3Dh2m31XUiL7unsacWGVi+6DZiVWLoBb6ZJ3zCabR5jOg==",
        "via": "2.0 fc5e625db631bc657fc73f189d53fa14.cloudfront.net (CloudFront)",
        "x-amzn-tls-cipher-suite": "ECDHE-RSA-AES128-GCM-SHA256",
        "sec-ch-ua-mobile": "?0",
        "upgrade-insecure-requests": "1",
        "host": "ytnvqv4vyv5jdhaj4xumgtgd4e0ggowg.lambda-url.eu-west-1.on.aws",
        "sec-fetch-mode": "navigate",
        "x-amz-date": "20240426T133548Z",
        "x-forwarded-proto": "https",
        "x-forwarded-for": "81.166.192.92",
        "priority": "u=0, i",
        "x-amz-source-arn": "arn:aws:cloudfront::602472554111:distribution/E3RIXKOQDC23IE",
        "sec-ch-ua": "\"Chromium\";v=\"124\", \"Google Chrome\";v=\"124\", \"Not-A.Brand\";v=\"99\"",
        "x-amzn-trace-id": "Root=1-662badb4-7b849839607a46814fd5c1db",
        "sec-ch-ua-platform": "\"Windows\"",
        "accept-encoding": "gzip",
        "x-amz-cf-id": "TEFeqAECknBq5wwXW6rdwZR6os03LoZJyFZjEbjD-VW7ImAl1BdpYg==",
        "user-agent": "Amazon CloudFront",
        "sec-fetch-dest": "document"
    },
    "queryStringParameters": {
        "input1": "YES"
    },
    "requestContext": {
        "accountId": "anonymous",
        "apiId": "ytnvqv4vyv5jdhaj4xumgtgd4e0ggowg",
        "domainName": "ytnvqv4vyv5jdhaj4xumgtgd4e0ggowg.lambda-url.eu-west-1.on.aws",
        "domainPrefix": "ytnvqv4vyv5jdhaj4xumgtgd4e0ggowg",
        "http": {
            "method": "GET",
            "path": "/index.html",
            "protocol": "HTTP/1.1",
            "sourceIp": "64.252.86.126",
            "userAgent": "Amazon CloudFront"
        },
        "requestId": "4f4c2779-05cb-439b-ba55-b50415679622",
        "routeKey": "$default",
        "stage": "$default",
        "time": "26/Apr/2024:13:35:48 +0000",
        "timeEpoch": 1714138548591
    },
    "isBase64Encoded": false
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
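As a small sanity check on the log entry above (a sketch; the values are copied from the event), the &lt;code&gt;x-amz-content-sha256&lt;/code&gt; header for this GET request is simply the SHA-256 digest of an empty request body, and the millisecond &lt;code&gt;timeEpoch&lt;/code&gt; value lines up with the human-readable &lt;code&gt;time&lt;/code&gt; field:

```python
import hashlib
from datetime import datetime, timezone

# A GET request has no body, so SigV4 signs the digest of the empty string.
empty_body_digest = hashlib.sha256(b"").hexdigest()
print(empty_body_digest)
# e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

# timeEpoch is in milliseconds; converting it to UTC matches the "time" field.
event_time = datetime.fromtimestamp(1714138548591 / 1000, tz=timezone.utc)
print(event_time.strftime("%d/%b/%Y:%H:%M:%S +0000"))
# 26/Apr/2024:13:35:48 +0000
```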



&lt;p&gt;Direct unauthorized access to the AWS Lambda Function URL is now denied, and our lightweight API can only be accessed through CloudFront with its built-in L3/L4 DDoS protection.&lt;/p&gt;

&lt;h4&gt;
  
  
  Additional protection with AWS Web Application Firewall
&lt;/h4&gt;

&lt;p&gt;To stop malicious requests such as botnet traffic, SQL injection and cross-site scripting (XSS), we associate a &lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/how-aws-waf-works.html" rel="noopener noreferrer"&gt;Web Application Firewall Access Control List&lt;/a&gt; with the CloudFront distribution. Read more about AWS WAF, its core functionality and aspects to take into consideration in &lt;a href="https://hedrange.com/2024/05/23/protect-your-webapps-from-malicious-traffic-with-aws-web-application-firewall/" rel="noopener noreferrer"&gt;Protect your webapps from malicious traffic with AWS Web Application Firewall&lt;/a&gt;.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
# Web Application Firewall resources

# Common CloudWatch log group for WAF logs
resource "aws_cloudwatch_log_group" "waf_cloudwatch_logs" {
  #checkov:skip=CKV_AWS_158: KMS encryption unnecessary for this use-case.
  count = var.provision_cloudfront == true ? 1 : 0
  provider = aws.us-east-1
  name = "aws-waf-logs-lambda-function-url-demo"
  retention_in_days = 365
}

resource "aws_wafv2_web_acl_logging_configuration" "waf_cloudwatch_logs_config" {
  count = var.provision_cloudfront == true ? 1 : 0
  provider = aws.us-east-1
  log_destination_configs = [aws_cloudwatch_log_group.waf_cloudwatch_logs[0].arn]
  resource_arn = aws_wafv2_web_acl.lambda_function_url_demo[0].arn
}

resource "aws_cloudwatch_log_resource_policy" "waf_cloudwatch_logs_resource_policy" {
  count = var.provision_cloudfront == true ? 1 : 0
  provider = aws.us-east-1
  policy_document = data.aws_iam_policy_document.waf_logging[0].json
  policy_name = "webacl-policy-waf-lambda-function-url-demo"
}

# Create a Web ACL
resource "aws_wafv2_web_acl" "lambda_function_url_demo" {
  #checkov:skip=CKV2_AWS_31: WAF logging is configured separately via aws_wafv2_web_acl_logging_configuration above.
  count = var.provision_cloudfront == true ? 1 : 0
  provider = aws.us-east-1
  name = "lambda_function_url_demo"
  description = "Web ACL with managed rule groups for lambda_function_url_demo"
  scope = "CLOUDFRONT"

  default_action {
    allow {}
  }

  rule {
    name = "AWSManagedRulesAmazonIpReputationList"
    priority = 1

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name = "AWSManagedRulesAmazonIpReputationList"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = false
      metric_name = "AWSManagedRulesAmazonIpReputationList"
      sampled_requests_enabled = false
    }
  }

  rule {
    name = "AWSManagedRulesWordPressRuleSet"
    priority = 2

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name = "AWSManagedRulesWordPressRuleSet"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = false
      metric_name = "AWSManagedRulesWordPressRuleSet"
      sampled_requests_enabled = false
    }
  }

  rule {
    name = "AWSManagedRulesKnownBadInputsRuleSet"
    priority = 3

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name = "AWSManagedRulesKnownBadInputsRuleSet"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = false
      metric_name = "AWSManagedRulesKnownBadInputsRuleSet"
      sampled_requests_enabled = false
    }
  }

  rule {
    name = "AWSManagedRulesCommonRuleSet"
    priority = 4

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name = "AWSManagedRulesCommonRuleSet"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = false
      metric_name = "AWSManagedRulesCommonRuleSet"
      sampled_requests_enabled = false
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name = "web-acl-lambda-function-url-demo"
    sampled_requests_enabled = true
  }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Full code example
&lt;/h3&gt;

&lt;p&gt;To tie together all the bits and pieces, I have developed a sample Terraform module which you can inspect further: &lt;a href="https://github.com/haakond/terraform-aws-lambda-function-url" rel="noopener noreferrer"&gt;https://github.com/haakond/terraform-aws-lambda-function-url&lt;/a&gt;. Study the &lt;a href="https://github.com/haakond/terraform-aws-lambda-function-url/blob/main/README.md" rel="noopener noreferrer"&gt;README.md&lt;/a&gt; and &lt;a href="https://github.com/haakond/terraform-aws-lambda-function-url/blob/main/examples/main.tf" rel="noopener noreferrer"&gt;examples/main.tf&lt;/a&gt; for complete documentation on how to get up and running, including Lambda function code in Python. If you find it useful, feel free to fork and adjust to your needs.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;module "lambda_function_url_demo" {
  source = "git::https://github.com/haakond/terraform-aws-lambda-function-url.git?ref=e3c72cb76d4a1d5b5b56e4a56a117f0949002a9d"

  # As global resources related to CloudFront and WAF need to be provisioned in us-east-1, we pass in two different providers.
  # Reference: https://developer.hashicorp.com/terraform/language/modules/develop/providers#passing-providers-explicitly
  provision_cloudfront = false # Set to false on the first run and to true on the second run, because of a circular dependency between the Lambda and CloudFront resources.
  providers = {
    aws = aws
    aws.us-east-1 = aws.us-east-1
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this article we explored how AWS Lambda Function URLs can be a compelling alternative to self-hosted APIs behind Application Load Balancers or Amazon API Gateway for lightweight and simple REST API use-cases. We looked at how to secure the solution by adopting Amazon CloudFront and AWS Web Application Firewall. As a bonus, custom domain names are also possible with AWS Certificate Manager support in CloudFront. The solution is fully serverless, with no servers or containers to patch or manage.&lt;/p&gt;
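For reference, a custom domain could be wired in roughly like this (a hedged sketch: the alias domain and the ACM certificate resource name are illustrative, and the certificate must be issued in us-east-1 for CloudFront):

```hcl
resource "aws_cloudfront_distribution" "lambda_function_url_demo" {
  # ... origin, cache behavior and other arguments as shown earlier ...

  aliases = ["api.example.com"] # illustrative alternate domain name

  viewer_certificate {
    # Hypothetical ACM certificate resource, provisioned in us-east-1
    acm_certificate_arn      = aws_acm_certificate.api_example_com.arn
    ssl_support_method       = "sni-only"
    minimum_protocol_version = "TLSv1.2_2021"
  }
}
```

This replaces the `cloudfront_default_certificate = true` block from the earlier example; the DNS record for the alias must also point at the distribution's domain name.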

&lt;p&gt;&lt;strong&gt;Further reading&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Amazon CloudFront now supports Origin Access Control (OAC) for Lambda function URL origins&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/private-content-restricting-access-to-lambda.html" rel="noopener noreferrer"&gt;Amazon CloudFront Developer Guide – Restricting access to an AWS Lambda Function URL origin&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-urls.html" rel="noopener noreferrer"&gt;AWS Lambda Developer Guide – Lambda Function URLs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://registry.terraform.io/modules/terraform-aws-modules/lambda/aws/latest" rel="noopener noreferrer"&gt;AWS Lambda Terraform module&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/distribution-web-awswaf.html" rel="noopener noreferrer"&gt;Amazon Cloudfront – Developer Guide – Using AWS WAF protections&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudfront_origin_access_control" rel="noopener noreferrer"&gt;Terraform registry – Resource: aws_cloudfront_origin_access_control&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/haakond/terraform-aws-lambda-function-url/" rel="noopener noreferrer"&gt;https://github.com/haakond/terraform-aws-lambda-function-url/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The post &lt;a href="https://hedrange.com/2024/05/03/develop-lightweight-and-secure-rest-apis-with-aws-lambda-function-url-and-terraform/" rel="noopener noreferrer"&gt;Develop lightweight and secure REST APIs with AWS Lambda Function URL and Terraform&lt;/a&gt; first appeared on &lt;a href="https://hedrange.com" rel="noopener noreferrer"&gt;Håkon Eriksen Drange&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>articles</category>
      <category>aws</category>
      <category>cloudfront</category>
      <category>lambda</category>
    </item>
  </channel>
</rss>
