<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Thomas Laue</title>
    <description>The latest articles on Forem by Thomas Laue (@tlaue).</description>
    <link>https://forem.com/tlaue</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F142858%2Fc3312653-2191-474f-9d26-28c162ff0cb1.jpg</url>
      <title>Forem: Thomas Laue</title>
      <link>https://forem.com/tlaue</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/tlaue"/>
    <language>en</language>
    <item>
      <title>AWS documentation and service quotas are your friends – do not miss them!</title>
      <dc:creator>Thomas Laue</dc:creator>
      <pubDate>Fri, 06 Oct 2023 14:31:16 +0000</pubDate>
      <link>https://forem.com/fmegroup/aws-documentation-and-service-quotas-are-your-friends-do-not-miss-them-1n43</link>
      <guid>https://forem.com/fmegroup/aws-documentation-and-service-quotas-are-your-friends-do-not-miss-them-1n43</guid>
      <description>&lt;p&gt;Who loves reading docs? Probably not that many developers and engineers. Coding and developing are so much more exciting than spending hours reading tons of documentation. However, just recently I was taught again that this is one of the big misconceptions and fallacies -- probably not only for me.&lt;/p&gt;

&lt;p&gt;AWS provides extensive documentation for each service which contains not just a general overview but, in most cases, detailed knowledge of the specifics of that service. The documentation for most services consists of hundreds of pages including a lot of examples and code snippets which are quite often helpful -- especially those related to IAM policies. It is not always easy to find the relevant pieces for a specific edge case, and they might be missing from time to time, but overall AWS has done a great job documenting its landscape.&lt;/p&gt;

&lt;p&gt;Service quotas, which exist for every AWS service, are another part which should not be missed when either starting to work with a new AWS service or using one more extensively. Many headaches and hours lost debugging an issue could be avoided by taking these quotas into account right from the start. Unfortunately, this lesson is all too easy to forget, as the following example shows.&lt;/p&gt;

&lt;p&gt;In a recent project, AWS DataSync was used to move about 40 million files from an AWS EFS share to an S3 bucket. The whole sync process was to be repeated from time to time after the initial sync to take new and updated files into account. AWS DataSync supports this scenario by applying an incremental approach after the first run.&lt;/p&gt;

&lt;p&gt;One DataSync location was created for EFS and another one for S3, and both were tied together by a DataSync task which configures, among other things, the sync properties. The initial run of this task went fine. All files were synced after about 9 hours.&lt;/p&gt;

&lt;p&gt;Some days later a first incremental sync was started to reflect the changes which had happened on EFS since the first run. The task went into the preparation phase but broke after about 30 minutes with a strange error message:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2WTT-UGk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6zka9hs2ri82cpk8gbmg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2WTT-UGk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6zka9hs2ri82cpk8gbmg.png" alt="AWS Cloudtrail extract showing a strange error message related to AWS DataSync" width="800" height="143"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;"Cannot allocate memory" -- what are you trying to tell me? No memory setting was configured in the DataSync task definition as no agent was involved. The first hit on Google shed some light on this problem by redirecting me to the documentation of AWS DataSync&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--y0HvVXSQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/w0n18uymki8fgbr2sx22.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--y0HvVXSQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/w0n18uymki8fgbr2sx22.png" alt="Explanation of error message in AWS DataSync documentation" width="800" height="209"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;which contains a link to the DataSync task quotas:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--BTOmCL-6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/mmcqpgjz15de4j9v37qi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--BTOmCL-6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/mmcqpgjz15de4j9v37qi.png" alt="Extract of AWS DataSync quotas" width="800" height="337"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Apparently, 40 million files are way too many for one task, as only 25 million are supported when transferring files between AWS storage services. A request to AWS Support confirmed as well that the problem was related to the large number of files. I have no idea why the initial run was able to complete, but at least the follow-up run failed. Splitting the task into several smaller ones solved this issue, so that the incremental run could finally succeed as well.&lt;/p&gt;
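
&lt;p&gt;One way to plan such a split is to count the files per top-level directory and group the directories so that no group exceeds the quota. The following is only a minimal sketch of that planning step, not the setup used in the project: the directory names and file counts are invented, the limit is the 25 million quoted above, and the first-fit-decreasing grouping is an assumption. Each resulting group could then back one DataSync task, for example via include filters.&lt;/p&gt;

```python
FILE_QUOTA_PER_TASK = 25_000_000  # limit quoted above for transfers between AWS storage services


def plan_tasks(file_counts, quota=FILE_QUOTA_PER_TASK):
    """Group directories (first-fit decreasing) so that no group exceeds the quota."""
    groups = []  # each entry: {"dirs": [...], "files": int}
    # Place the largest directories first; this tends to need fewer groups.
    for name, count in sorted(file_counts.items(), key=lambda kv: kv[1], reverse=True):
        if count > quota:
            raise ValueError(f"{name} alone exceeds the quota; split it further")
        for group in groups:
            # Reuse the first group that still has room for this directory.
            if quota - group["files"] >= count:
                group["dirs"].append(name)
                group["files"] += count
                break
        else:
            groups.append({"dirs": [name], "files": count})
    return groups


# Hypothetical file counts per top-level directory on the EFS share (40 million in total).
example_counts = {
    "/invoices": 18_000_000,
    "/reports": 12_000_000,
    "/archive": 7_000_000,
    "/uploads": 3_000_000,
}

for number, group in enumerate(plan_tasks(example_counts), start=1):
    print(f"task {number}: {group['files']:,} files from {group['dirs']}")
```

&lt;p&gt;With these invented numbers the plan consists of two tasks, one with 25 million and one with 15 million files. A single directory that is larger than the quota raises an error and has to be split at a deeper level.&lt;/p&gt;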

&lt;p&gt;Nevertheless, some hours were lost even though we learned something new.&lt;/p&gt;

&lt;p&gt;Lessons learned -- again:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Embrace the docs -- even though they are really extensive!&lt;/li&gt;
&lt;li&gt;  Take the service quotas into account before starting to work and while working with an AWS service. They will become relevant one day -- possibly sooner rather than later!&lt;/li&gt;
&lt;li&gt;  AWS technical support really likes to help and is quite competent. Do not hesitate to contact them (if you have a support plan available).&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>cloud</category>
      <category>cloudcomputing</category>
      <category>documentation</category>
    </item>
    <item>
      <title>A cross account cost overview dashboard powered by Lambda, Step Functions, S3 and Quicksight</title>
      <dc:creator>Thomas Laue</dc:creator>
      <pubDate>Fri, 17 Feb 2023 13:01:44 +0000</pubDate>
      <link>https://forem.com/fmegroup/a-cross-account-cost-overview-dashboard-powered-by-lambda-step-functons-s3-and-quicksight-22c1</link>
      <guid>https://forem.com/fmegroup/a-cross-account-cost-overview-dashboard-powered-by-lambda-step-functons-s3-and-quicksight-22c1</guid>
      <description>&lt;p&gt;Keeping an eye on cloud spending in AWS or any other cloud service&lt;br&gt;
provider is one of the most important parts of every project team's&lt;br&gt;
work. This can be tedious when following the AWS recommendation and best&lt;br&gt;
practice to split workloads over more than one AWS account. Consolidated&lt;br&gt;
billing -- a feature of AWS Organizations -- would help in this case, but&lt;br&gt;
access to the main/billing account is very often not granted to project&lt;br&gt;
teams and departments for good reason.&lt;/p&gt;

&lt;p&gt;In this article, a solution is presented which automatically&lt;br&gt;
collects billing information from various accounts and presents it in a&lt;br&gt;
concise format in one or more AWS Quicksight dashboards. Before starting&lt;br&gt;
to go into the details, let's look briefly at the available AWS services&lt;br&gt;
for cost management and the difficulty when working with more than one&lt;br&gt;
account.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AWS Cost Management Platform and Consolidated Billing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AWS provides a complete toolset consisting of AWS Billing, Cost&lt;br&gt;
Explorer, and other sub services. They offer functionalities to get a&lt;br&gt;
focused overview of costs and usage as well as tools to drill down&lt;br&gt;
into the details of most services. The tooling is sufficient to get the&lt;br&gt;
job done when working with one AWS account even though the user&lt;br&gt;
experience might not be as awesome as provided by other more dedicated&lt;br&gt;
and focused third-party tools. Special IAM permissions are needed to&lt;br&gt;
access all this information to tackle the governance part as well.&lt;/p&gt;

&lt;p&gt;AWS Organizations, which governs all AWS accounts associated with it,&lt;br&gt;
provides some extended functionalities (known as consolidated billing) to&lt;br&gt;
centralize cost monitoring and management. However, in most companies&lt;br&gt;
and large enterprises only very few people are allowed to access the&lt;br&gt;
billing account. This does not help a project lead to get the required&lt;br&gt;
information easily. Depending on the number of AWS accounts used by a&lt;br&gt;
team, someone with access has to either log in to every account&lt;br&gt;
regularly and check the costs or rely on "Cost and Usage" reports which&lt;br&gt;
can be exported automatically to S3. These reports are very detailed&lt;br&gt;
(maybe too much for simple use cases) and require some custom tooling to&lt;br&gt;
extract the required information.&lt;/p&gt;

&lt;p&gt;AWS has published a solution called &lt;a href="https://wellarchitectedlabs.com/cost/200_labs/200_cloud_intelligence/cost-usage-report-dashboards/"&gt;Cloud Intelligence&lt;br&gt;
Dashboards&lt;/a&gt; -- a collection of Quicksight dashboards which are based, among other data sources, on these cost and usage reports. Besides this one, company-internal cost control tools -- sometimes based on the same AWS services -- exist and can be "rented". All these solutions have their advantages and use cases -- but also drawbacks (mostly related to their costs and sometimes also due to overly broad IAM permission requirements).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;An approach for simple use cases&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Sometimes it is fully sufficient to present some information in a&lt;br&gt;
concise manner to stay informed about the general trends and total&lt;br&gt;
amounts. If something looks strange or falls outside the&lt;br&gt;
expected range, a more detailed analysis can be performed using for&lt;br&gt;
instance the AWS tooling mentioned above.&lt;/p&gt;

&lt;p&gt;Following this idea, Step Functions, Lambda, S3 and Quicksight power a&lt;br&gt;
small application which retrieves cost-related data for every designated&lt;br&gt;
AWS account and stores it as a JSON file in S3. Quicksight, which&lt;br&gt;
supports S3 as a data source, reads this data directly and uses it to&lt;br&gt;
build one or more dashboards displaying various cost-related diagrams&lt;br&gt;
and tables. This workflow (which is shown below) is triggered by an&lt;br&gt;
EventBridge rule regularly (e.g., once a day) so that up-to-date&lt;br&gt;
information is available.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--flgRn7YP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/xdaxojpy5f5b02zmpgxb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--flgRn7YP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/xdaxojpy5f5b02zmpgxb.png" alt="StepFunctions workflow which coordinates the cost related data" width="427" height="684"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The "Data preparation" step provides the AWS account IDs and names as input for the following &lt;em&gt;Map State&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--vIfEMsvy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lbm3d352zrubobplhl87.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vIfEMsvy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lbm3d352zrubobplhl87.png" alt="Input data for Map State" width="553" height="793"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For every array element (= AWS account), the Map State starts an&lt;br&gt;
instance of a Lambda function which assumes an IAM role in the&lt;br&gt;
relevant account and queries the AWS Cost Explorer and AWS Budgets APIs&lt;br&gt;
to collect the required information: total current costs, costs per&lt;br&gt;
service, cost forecasts...&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inject_lambda_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;log_event&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;tracer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;capture_lambda_handler&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
     &lt;span class="n"&gt;validate_input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

     &lt;span class="n"&gt;assumed_role_session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;assume_role&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
         &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
         &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s"&gt;'arn:aws:iam::&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"account_id"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:role/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ROLE_NAME&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
         &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"eu-central-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
     &lt;span class="p"&gt;)&lt;/span&gt;

     &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;assumed_role_session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ce"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="n"&gt;budget_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;assumed_role_session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"budgets"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

     &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;arrow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
     &lt;span class="n"&gt;first_day_of_current_month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_start_and_end_date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

     &lt;span class="n"&gt;cost_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_cost_and_usage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
         &lt;span class="n"&gt;TimePeriod&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
             &lt;span class="s"&gt;"Start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;first_day_of_current_month&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date_format&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
             &lt;span class="s"&gt;"End"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;end_date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date_format&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
         &lt;span class="p"&gt;},&lt;/span&gt;
         &lt;span class="n"&gt;Granularity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"MONTHLY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
         &lt;span class="n"&gt;Metrics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"UnblendedCost"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
     &lt;span class="p"&gt;)&lt;/span&gt;

     &lt;span class="n"&gt;cost_per_service&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_cost_and_usage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
         &lt;span class="n"&gt;TimePeriod&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
             &lt;span class="s"&gt;"Start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;first_day_of_current_month&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date_format&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
             &lt;span class="s"&gt;"End"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;end_date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date_format&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
         &lt;span class="p"&gt;},&lt;/span&gt;
         &lt;span class="n"&gt;Granularity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"MONTHLY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
         &lt;span class="n"&gt;Metrics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"UnblendedCost"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
         &lt;span class="n"&gt;GroupBy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="s"&gt;"Type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"DIMENSION"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"SERVICE"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
     &lt;span class="p"&gt;)&lt;/span&gt;

 &lt;span class="err"&gt;…&lt;/span&gt;

     &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
         &lt;span class="s"&gt;"date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date_format&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
         &lt;span class="s"&gt;"current_month"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
         &lt;span class="s"&gt;"current_year"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
         &lt;span class="s"&gt;"account_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"account_name"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;upper&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
         &lt;span class="s"&gt;"current_costs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cost_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"ResultsByTime"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s"&gt;"Total"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s"&gt;"UnblendedCost"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s"&gt;"Amount"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  
         &lt;span class="s"&gt;"forecasted_costs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;forecasted_cost&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
         &lt;span class="s"&gt;"budget"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;budget_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"Budgets"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s"&gt;"BudgetLimit"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s"&gt;"Amount"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
         &lt;span class="s"&gt;"account_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"account_id"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
     &lt;span class="p"&gt;}&lt;/span&gt;

     &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;service_costs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
         &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;

     &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"statusCode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
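
&lt;p&gt;The elided part of the handler is not reproduced here. Judging by the merge loop at the end, it produces a list of small dictionaries, one per service. The following hedged sketch shows how such a list could be derived from the grouped Cost Explorer response. The helper name &lt;code&gt;extract_service_costs&lt;/code&gt; and the sample response are made up for illustration; only the response layout (ResultsByTime, Groups, Keys, Metrics) follows the GetCostAndUsage API.&lt;/p&gt;

```python
def extract_service_costs(cost_per_service):
    """Turn a grouped get_cost_and_usage response into [{service name: amount}, ...]."""
    service_costs = []
    for period in cost_per_service["ResultsByTime"]:
        for group in period["Groups"]:
            service_name = group["Keys"][0]  # one key per group when grouping by SERVICE only
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])  # the API returns strings
            service_costs.append({service_name: amount})
    return service_costs


# Shortened, made-up response in the shape returned by the Cost Explorer API.
sample_response = {
    "ResultsByTime": [
        {
            "TimePeriod": {"Start": "2023-02-01", "End": "2023-02-09"},
            "Groups": [
                {"Keys": ["AWS Lambda"], "Metrics": {"UnblendedCost": {"Amount": "30.24534433", "Unit": "USD"}}},
                {"Keys": ["AWS CloudTrail"], "Metrics": {"UnblendedCost": {"Amount": "3.0943441333", "Unit": "USD"}}},
            ],
        }
    ]
}

payload = {"account_name": "TEST"}
for item in extract_service_costs(sample_response):
    payload = payload | item  # dict union, requires Python 3.9 or newer

print(payload)
```

&lt;p&gt;Keeping one flat dictionary per account means every service shows up as its own column once Quicksight reads the JSON file.&lt;/p&gt;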



&lt;p&gt;The outcome of the Map State is an array consisting of the cost data retrieved from every AWS account.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2023-02-09"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"current_month"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"current_year"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2023&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"account_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TEST"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"current_costs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"152.49"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"forecasted_costs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"468.10"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"budget"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"700.00"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"account_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"11111111111x"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"Amazon DynamoDB"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;10.0002276717&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"AWS CloudTrail"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;3.0943441333&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"AWS Lambda"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;30.24534433&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"AWS Key Management Service"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.8699148608&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"AWS Step Functions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;32.5439809890&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
     &lt;/span&gt;&lt;span class="nl"&gt;"Amazon Relational Database Service"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;16.1240975116&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="err"&gt;…&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This data is stored as a JSON file in an S3 bucket in the last workflow&lt;br&gt;
step. Every file name contains the current date to make it unique.&lt;/p&gt;
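
&lt;p&gt;How the object key is built is not shown in the article. A minimal sketch, assuming a hypothetical &lt;code&gt;cost-data&lt;/code&gt; prefix, a hypothetical bucket name and an ISO date in the file name, could look like this:&lt;/p&gt;

```python
import json
from datetime import date


def build_s3_object(cost_data, day, prefix="cost-data"):
    """Return the object key and the JSON body for one day's cost data."""
    key = f"{prefix}/costs-{day.isoformat()}.json"  # the date keeps every file name unique
    body = json.dumps(cost_data)
    return key, body


key, body = build_s3_object(
    [{"account_name": "TEST", "current_costs": "152.49"}],
    date(2023, 2, 9),
)
print(key)  # cost-data/costs-2023-02-09.json

# With boto3, the upload itself could then look like this (bucket name is hypothetical):
# boto3.client("s3").put_object(Bucket="my-cost-bucket", Key=key, Body=body.encode("utf-8"))
```

&lt;p&gt;An ISO date in the key also keeps the files sorted chronologically when listing the bucket.&lt;/p&gt;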

&lt;p&gt;Apart from the Step Functions workflow and the Lambda function, which can&lt;br&gt;
for instance run in a dedicated AWS account to simplify the permission&lt;br&gt;
management, an IAM role needs to be deployed to every account whose&lt;br&gt;
costs should be retrieved. This role must trust the Lambda function's&lt;br&gt;
execution role and contain the necessary permissions to query AWS Cost&lt;br&gt;
Explorer and AWS Budgets. An example written in Terraform is given below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"iam_assume_role_sts"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/iam/aws//modules/iam-assumable-role"&lt;/span&gt;
   &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"&amp;gt;= 4.7.0"&lt;/span&gt;

   &lt;span class="nx"&gt;role_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"query-cost-center-role"&lt;/span&gt;
   &lt;span class="nx"&gt;trusted_role_arns&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
     &lt;span class="kd"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;query_cost_center_lambda_role_arn&lt;/span&gt;
   &lt;span class="p"&gt;]&lt;/span&gt;

   &lt;span class="nx"&gt;create_role&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
   &lt;span class="nx"&gt;role_requires_mfa&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

   &lt;span class="nx"&gt;custom_role_policy_arns&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
     &lt;span class="k"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;query_cost_center_policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
   &lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;

 &lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"query_cost_center_policy"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/iam/aws//modules/iam-policy"&lt;/span&gt;
   &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"&amp;gt;= 4.7.0"&lt;/span&gt;

   &lt;span class="nx"&gt;name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Query-cost-center-policy"&lt;/span&gt;
   &lt;span class="nx"&gt;path&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"/"&lt;/span&gt;
   &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"This policy allows to query the AWS CostCenter"&lt;/span&gt;
   &lt;span class="nx"&gt;policy&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_iam_policy_document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;query_cost_center_policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;json&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;

 &lt;span class="k"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_policy_document"&lt;/span&gt; &lt;span class="s2"&gt;"query_cost_center_policy"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="nx"&gt;statement&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
     &lt;span class="nx"&gt;actions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
       &lt;span class="s2"&gt;"ce:GetCostAndUsage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="s2"&gt;"ce:GetCostForecast"&lt;/span&gt;
     &lt;span class="p"&gt;]&lt;/span&gt;

     &lt;span class="nx"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;}&lt;/span&gt;

   &lt;span class="nx"&gt;statement&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
     &lt;span class="nx"&gt;actions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
       &lt;span class="s2"&gt;"budgets:ViewBudget"&lt;/span&gt;
     &lt;span class="p"&gt;]&lt;/span&gt;

     &lt;span class="nx"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
       &lt;span class="s2"&gt;"arn:aws:budgets::&lt;/span&gt;&lt;span class="k"&gt;${data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_caller_identity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;account_id&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:budget/project_budget"&lt;/span&gt;
     &lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="p"&gt;}&lt;/span&gt;
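 &lt;span class="p"&gt;}&lt;/span&gt;

 &lt;span class="c1"&gt;# Account ID lookup used in the budget ARN above (omit if already declared elsewhere)&lt;/span&gt;
 &lt;span class="k"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"aws_caller_identity"&lt;/span&gt; &lt;span class="s2"&gt;"current"&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;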
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Data visualization with Quicksight&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Quicksight supports S3 as a direct data source -- only a manifest file&lt;br&gt;
containing a description of the data stored in the bucket is needed.&lt;br&gt;
This is quite handy for data whose structure changes rarely or not at&lt;br&gt;
all.&lt;/p&gt;
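
&lt;p&gt;As a rough illustration, such a manifest could be generated as follows. The bucket name and prefix are placeholders; the keys &lt;code&gt;fileLocations&lt;/code&gt;, &lt;code&gt;URIPrefixes&lt;/code&gt; and &lt;code&gt;globalUploadSettings&lt;/code&gt; follow the documented manifest format, but the concrete values are assumptions and not the manifest used in this project.&lt;/p&gt;

```python
import json


def build_manifest(bucket, prefix):
    """Build a Quicksight S3 manifest pointing at all JSON files under a prefix."""
    return {
        "fileLocations": [
            # URIPrefixes selects every object below the given prefix.
            {"URIPrefixes": [f"s3://{bucket}/{prefix}/"]}
        ],
        # The cost reports are stored as JSON files.
        "globalUploadSettings": {"format": "JSON"},
    }


# Placeholder bucket and prefix, for illustration only.
manifest_json = json.dumps(
    build_manifest("example-cost-bucket", "cost-reports"), indent=2
)
```

&lt;p&gt;The resulting JSON document is uploaded or referenced once when the dataset is created.&lt;/p&gt;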

&lt;p&gt;A more involved setup including AWS Glue and Amazon Athena might be&lt;br&gt;
beneficial in cases where either a lot of details (not only basic cost&lt;br&gt;
information) are queried, or a lot of different AWS services are used&lt;br&gt;
over time in the different AWS accounts. It might happen that Quicksight&lt;br&gt;
runs into problems when trying to load this kind of data as it is going&lt;br&gt;
to change constantly, and the manifest file requires a lot of updates. A&lt;br&gt;
Glue Crawler combined with an Athena table might be the better approach&lt;br&gt;
in such a scenario.&lt;/p&gt;

&lt;p&gt;As soon as a new dataset based on the S3 bucket has been created, one or&lt;br&gt;
several dashboards can be implemented. They can present overview&lt;br&gt;
data as in the example below or go into more detail --&lt;br&gt;
depending on the specific requirements. How to create these dashboards&lt;br&gt;
is out of scope for this article, but Quicksight offers enough tooling to&lt;br&gt;
get from a simple start to a sophisticated display of information.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--UZw58LUj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wr167vegkmuq3rln7pke.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--UZw58LUj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wr167vegkmuq3rln7pke.png" alt="Quicksight dashboard showing cost related details" width="880" height="524"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A Quicksight dashboard can either be shared with individuals in need of&lt;br&gt;
this information or a scheduled email notification can be established.&lt;br&gt;
Quicksight will send an email to all specified recipients which can&lt;br&gt;
include an image of the dashboard as well as some data in CSV format.&lt;br&gt;
This feature helps a lot because it is not always necessary to log in to keep the&lt;br&gt;
costs under control. Simply receiving an automated message, for&lt;br&gt;
instance every day or just once a week, can already help to stay&lt;br&gt;
informed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wrapping up&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Cost monitoring is an important topic for every project -- from small to&lt;br&gt;
large. AWS offers various tools to stay up to date, but this task&lt;br&gt;
gets tedious when following AWS best practice and separating an&lt;br&gt;
application into different stages and AWS accounts. There are third-party&lt;br&gt;
or company-internal tools available which help to overcome this&lt;br&gt;
situation, but it is not always possible to use them (especially in an&lt;br&gt;
enterprise setup) or they come with their own drawbacks.&lt;/p&gt;

&lt;p&gt;This blog post has presented a small-scale application which offers&lt;br&gt;
enough information and details to monitor the costs generated by small&lt;br&gt;
to medium-sized projects. It has its limitations as it might not be&lt;br&gt;
powerful enough when dealing with tens or even hundreds of AWS accounts&lt;br&gt;
-- but that is not the typical setup of a project.&lt;/p&gt;

&lt;p&gt;Photo credits&lt;br&gt;
Photo by Anna Nekrashevich: &lt;a href="https://www.pexels.com/de-de/foto/lupe-oben-auf-dem-dokument-6801648/"&gt;https://www.pexels.com/de-de/foto/lupe-oben-auf-dem-dokument-6801648/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>cloud</category>
      <category>python</category>
    </item>
    <item>
      <title>How not to send all your money to AWS</title>
      <dc:creator>Thomas Laue</dc:creator>
      <pubDate>Fri, 07 Oct 2022 12:52:33 +0000</pubDate>
      <link>https://forem.com/fmegroup/how-not-to-send-all-your-money-to-aws-2kdn</link>
      <guid>https://forem.com/fmegroup/how-not-to-send-all-your-money-to-aws-2kdn</guid>
      <description>&lt;p&gt;Over the last 20 years, the AWS environment has grown into a kind of universe providing more than 250 services. Many applications, which quite often use 5-10 or more of these services, benefit from this rich feature set. Burdens such as server management, which existed in local data centers, have been taken over by AWS to give developers and builders more freedom to be productive, creative and, in the end, more cost-effective.&lt;/p&gt;

&lt;p&gt;Billing and cost management at AWS is one of the hotter topics and has been discussed and complained about throughout the internet. AWS provides tools like the &lt;a href="https://calculator.aws/#/" rel="noopener noreferrer"&gt;AWS Pricing Calculator&lt;/a&gt; which helps to create cost estimates for application infrastructure. Significant effort has been spent over the last couple of years to improve the &lt;a href="https://aws.amazon.com/aws-cost-management/aws-billing/" rel="noopener noreferrer"&gt;AWS Billing Console&lt;/a&gt; in order to provide better insights into daily and monthly spending. However, it can still be hard to get a clear picture as every service, and quite often every AWS region, has its own pricing structure.&lt;/p&gt;

&lt;p&gt;At the same time, more and more cost saving options have been released. Depending on the characteristics and architecture of an application, it can be astonishingly easy to save a considerable amount of money with no or only a limited investment of engineering time using some of the following tips.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Clean up resources and data&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Probably the most obvious step to reduce costs is to delete unused resources like dangling EC2 or RDS instances, old EBS snapshots or AWS Backup files. S3 buckets containing huge amounts of old or unused data might also be a good starting point to reduce costs.&lt;/p&gt;

&lt;p&gt;Besides saving money, all these measures enhance security at the same time: things which are no longer available cannot be hacked or leaked. AWS provides features like AWS Backup/S3 retention policies to manage the data lifecycle automatically so that not everything must be done manually.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AWS Savings Plans&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Savings Plans, which come in different flavours, were released about 3 years ago. They offer discounts of up to 72 percent compared to On-Demand pricing when choosing the "EC2 Instance Savings Plans". The more flexible "Compute Savings Plans", which still offer up to 66 percent discount, are especially attractive as they cover not only EC2, but Lambda and Fargate as well.&lt;/p&gt;

&lt;p&gt;Workloads running 24/7 with a reasonably predictable load are best suited for this type of offering. Depending on the selected term length (1 or 3 years) and payment option (No, Partial or All Upfront), a fixed discount is granted in return for a commitment to spend a certain amount of dollars per hour on compute. Architectural changes like switching EC2 instance types or moving workloads from EC2 to Lambda or Fargate are possible and covered by the Compute Savings Plans.&lt;/p&gt;

&lt;p&gt;Purchasing Savings Plans requires a minimum of work and yields considerable savings, especially as most workloads require some sort of compute which contributes significantly to the total costs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reserved Instances and Reserved Capacity&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Unfortunately, AWS does not offer Savings Plans for all possible scenarios or AWS services, but other powerful discount options like &lt;a href="https://aws.amazon.com/rds/reserved-instances/" rel="noopener noreferrer"&gt;Amazon RDS Reserved Instances&lt;/a&gt; come to the rescue. Reserved Instances use configuration options comparable to those of Savings Plans and promise a similar discount rate for workloads which require continuously running database servers.&lt;/p&gt;

&lt;p&gt;The flexibility to make changes is however limited and depends on the database used. Nevertheless, it is worth considering Reserved Instances as a cost optimization choice which again requires only a minimal investment of time.&lt;/p&gt;

&lt;p&gt;Amazon DynamoDB, the serverless NoSQL database, offers a feature called Reserved Capacity. With it, a fixed amount of provisioned read and write throughput is purchased in advance. Similar term and payment conditions as already mentioned for Savings Plans and Reserved Instances apply here as well. Workloads with predictable traffic patterns benefit from lower costs compared to the On-Demand or standard Provisioned throughput pricing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Automatic Shutdown and Restart of EC2 and RDS Instances&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Many EC2 and RDS instances are only used at specific times during the day and most often not at all at the weekend. This applies mostly to development and test environments but might also be valid for production workloads. A considerable amount of money can be saved by turning off these idle instances when they are not needed.&lt;/p&gt;
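
&lt;p&gt;A rough back-of-the-envelope calculation shows the potential. The schedule of 12 hours on 5 workdays is an assumed example, and the sketch only counts instance hours: storage such as EBS volumes is still billed while an instance is stopped, so the real saving on the bill is lower.&lt;/p&gt;

```python
HOURS_PER_WEEK = 7 * 24


def weekly_savings_share(hours_per_workday, workdays_per_week=5):
    """Fraction of weekly instance hours saved compared to running 24/7."""
    running_hours = hours_per_workday * workdays_per_week
    return 1 - running_hours / HOURS_PER_WEEK


# Assumed schedule: instances run 12 hours a day from Monday to Friday.
share = weekly_savings_share(12)
print(f"Instance hours saved per week: {share:.0%}")
```

&lt;p&gt;With this assumed schedule the instances run 60 of 168 hours per week, so well over half of the instance hours are avoided.&lt;/p&gt;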

&lt;p&gt;An automated approach which initiates and manages the shutdown and restart according to a schedule can take over this task so that nearly no manual intervention is needed. AWS provides a solution called "&lt;a href="https://aws.amazon.com/solutions/implementations/instance-scheduler/" rel="noopener noreferrer"&gt;Instance Scheduler&lt;/a&gt;" which can be used to perform this work if no specific start or shutdown order has to be followed for an application.&lt;/p&gt;

&lt;p&gt;Specific workflows, which for instance require the databases to be started before any servers are launched, can be modelled with AWS Step Functions and executed using scheduled EventBridge rules. Step Functions is extremely powerful and supports a huge range of API operations so that nearly no custom code is necessary.&lt;/p&gt;

&lt;p&gt;An example of a real-world workflow which stops an application consisting of several RDS and EC2 instances is shown in the image below. A strict sequence of shutdown steps must be followed to make sure that the application stops correctly. This workflow is triggered every evening when the environment is no longer needed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvaekhjjle85abdrafhem.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvaekhjjle85abdrafhem.png" alt="AWS Step Functions workflow to stop a workload" width="800" height="913"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The counterpart is given in the next example. This workflow is used to launch the environment every morning before the first developer starts working.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fik51w3l81xpg0bp7rseu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fik51w3l81xpg0bp7rseu.png" alt="AWS Step Functions workflow to start a workload" width="800" height="1743"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Shutting down all EC2 and RDS instances at night and over the weekend cut the compute costs by about 50 percent in this project, which is significant for a larger environment. The only caveat with this approach so far has been insufficient EC2 capacity when trying to restart the instances in the morning. This has happened very rarely, but when it did, it took about half a day until AWS had enough resources available for a successful launch.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc9bzjsqrxvlkj6x369wv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc9bzjsqrxvlkj6x369wv.png" alt="Error message which pops up in case AWS cannot provide enough EC2 resources of a certain type" width="800" height="139"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Upscaling and downscaling instances in test environments&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This option might not work in all cases as test (aka UAT) environments often mirror the production workload by design to have nearly identical conditions when performing manual or automated tests. Load tests in particular, but others as well, should be executed on production-like systems as their results are not reliable otherwise. In addition, not every application runs as smoothly on smaller EC2 instances as on larger ones, and changing an instance size might require additional application configuration changes.&lt;/p&gt;

&lt;p&gt;Nevertheless, it is sometimes possible to downscale these instances (RDS databases might be an additional option) when load tests and other heavy tests are not performed on a regular basis (even though regular testing might be recommended in theory).&lt;/p&gt;

&lt;p&gt;Infrastructure-as-code frameworks like Terraform or CloudFormation make it relatively easy to define two configuration sets. They can be swapped prior to running a load test to upscale the environment. Changing the instance type of an EC2 instance requires a short stop and start, and some RDS databases can be modified with only minimal interruption. The whole up- or downscale process takes only a small amount of time (depending on the environment size and instance types) and can save a considerable amount of money.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Designing new applications with a serverless first mindset&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Serverless has become a buzzword and marketing term during the last few years (everything seems to be serverless nowadays), but at its core it is still a quite promising technological approach. Not having to deal with a whole bunch of administrative and operational tasks like provisioning and operating virtual servers or databases, paired with the "pay only for what you use" model, is quite appealing. Other advantages of serverless architectures are not discussed in this article but can be found easily using your favorite search engine.&lt;/p&gt;

&lt;p&gt;The "pay as you go" cost model in particular contributes directly to cost optimization (this post leaves aside topics like total cost of ownership and time to market, which are important in practice as well). There is no need to shut down or restart anything when it is not needed. Serverless components do not contribute to your AWS bill when they are not used -- for instance at night in development or test environments. Even production workloads, which often receive spiky rather than constant traffic, benefit compared to an architecture based on containers or VMs.&lt;/p&gt;

&lt;p&gt;Not every application or workload is suited for a serverless design model. To be fair it should be mentioned that a serverless approach can get much more expensive than a container based one in case of very heavy traffic patterns. However, this is probably relevant for just a very small portion of all existing implementations.&lt;/p&gt;

&lt;p&gt;Quite often it is possible and beneficial to replace a VM or container by one or more Lambda function(s), an RDS database by DynamoDB or a custom REST/GraphQL API implementation by API Gateway or AppSync. The learning curve is steep and well-designed serverless architectures are not that easy to achieve at the beginning as a complete shift in mindset is required, but believe me: this journey is worth the effort and is a ton of fun after having gained some insights into this technology.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Think about what should be logged and sent to CloudWatch&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Logging has been an important part of any application development and operation since the invention of software. Useful log output (hopefully in a structured form) can help to identify bugs or other deficiencies and provides useful insight into a software system. With CloudWatch, AWS provides a platform for log ingestion, storage and analysis.&lt;/p&gt;

&lt;p&gt;Unfortunately, log processing is quite costly. It is not unusual for the portion of the AWS bill related to CloudWatch to be up to 10 times higher than, for instance, that of Lambda in serverless projects. The same applies to container or VM based architectures; the ratio might not be that high there, but it is still not negligible. A concept for dealing with log output is advisable and might make a considerable difference at the end of the month.&lt;/p&gt;

&lt;p&gt;Some of the following ideas help to keep CloudWatch costs under control:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Change log retention time from "Never expire" to a reasonable value&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Apply log sampling as described in this &lt;a href="https://dev.to/tlaue/keep-your-cloudwatch-bill-under-control-when-running-aws-lambda-at-scale-3o40"&gt;post&lt;/a&gt; and for instance provided by the &lt;a href="https://awslabs.github.io/aws-lambda-powertools-python/latest/core/logger/" rel="noopener noreferrer"&gt;AWS Lambda Powertools&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Consider using a third-party monitoring system like &lt;a href="https://lumigo.io/" rel="noopener noreferrer"&gt;Lumigo&lt;/a&gt; or &lt;a href="https://www.datadoghq.com/" rel="noopener noreferrer"&gt;Datadog&lt;/a&gt; instead of outputting a lot of log messages. These external systems are not free and cannot always be used (especially in an enterprise context) but provide a lot of additional features which can make a real difference.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In some cases, it might be possible to send logs directly to other systems (instead of ingesting them first into CloudWatch) or to store them in S3 and use Athena to get some insights.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Activate logging when needed and suitable but not always by default -- not every application requires for instance VPC flow logs or API Gateway access logs even though good reasons exist to do so in certain environments (due to security reasons or certain regulations and company rules)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
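
&lt;p&gt;The log sampling idea from the list above can be sketched with the Python standard library alone. This is a simplified per-record variant for illustration, not the implementation of the AWS Lambda Powertools; the class name &lt;code&gt;DebugSamplingFilter&lt;/code&gt; and its parameters are hypothetical.&lt;/p&gt;

```python
import logging
import random


class DebugSamplingFilter(logging.Filter):
    """Pass records of any level other than DEBUG; sample DEBUG records."""

    def __init__(self, sample_rate, rng=None):
        super().__init__()
        self.sample_rate = sample_rate
        # An injectable random generator keeps the filter testable.
        self._rng = rng or random.Random()

    def filter(self, record):
        if record.levelno != logging.DEBUG:
            return True
        # Keep a DEBUG record with a probability of sample_rate.
        return self.sample_rate > self._rng.random()


# Hypothetical usage: only about 10 percent of DEBUG messages are emitted.
logger = logging.getLogger("sampled")
logger.setLevel(logging.DEBUG)
logger.addFilter(DebugSamplingFilter(sample_rate=0.1))
```

&lt;p&gt;Warnings and errors are always written, while the bulk of the debug output, which usually drives the ingestion costs, is reduced.&lt;/p&gt;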

&lt;p&gt;Logging is important and quite useful in most cases, but it makes sense to keep an eye on the expenditure and to adjust the logging concept in case of sprawling costs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wrap up&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;All the cost optimization possibilities mentioned above only scratch the surface of what is possible in the AWS universe. Things like S3 and DynamoDB storage tiers, EC2 spot instances and many others have not been covered here. Nevertheless, applying one or several of the strategies briefly discussed in this article can help to save a ton of money without having to spend weeks of engineering time. Savings Plans and Reserved Instances in particular, as well as shutting down idle instances, are easy and quite effective measures to reduce their contribution to the AWS bill by 30% to 50% for existing workloads. Newer ones which are suited for the serverless design model really benefit from its cost and operation model and provide a ton of fun for developers.&lt;/p&gt;

</description>
      <category>cloud</category>
      <category>serverless</category>
      <category>costs</category>
      <category>aws</category>
    </item>
    <item>
      <title>Refactor Terraform code with Moved Blocks - a new way without manually modifying the state</title>
      <dc:creator>Thomas Laue</dc:creator>
      <pubDate>Fri, 08 Jul 2022 11:07:49 +0000</pubDate>
      <link>https://forem.com/fmegroup/refactor-terraform-code-with-moved-blocks-a-new-way-without-manually-modifying-the-state-g2c</link>
      <guid>https://forem.com/fmegroup/refactor-terraform-code-with-moved-blocks-a-new-way-without-manually-modifying-the-state-g2c</guid>
      <description>&lt;p&gt;Most software and IT infrastructure projects which have been deployed to production have to deal with requirement changes during their lifetime. User expectations change, new use cases appear, traffic patterns are different than expected or new technology becomes available. Refactoring of existing code (application code as well as infrastructure-as-code) has always been an important task but also one of the major pain points in IT. Good support for refactoring through tools and patterns can make a difference for a framework like Terraform compared with its competitors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Setting the stage&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Terraform by HashiCorp -- one of the major players in the infrastructure-as-code framework world -- has been around since 2014. It has been used to set up a lot of small, medium, and large projects all over the world. It provides a rich feature set to define infrastructure in a concise manner. One of its strengths is the ability to create identical/similar resources using either the meta-argument &lt;code&gt;count&lt;/code&gt; or the newer &lt;code&gt;for_each&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;count&lt;/code&gt; makes it very easy to define identical resources, as shown in the listing below which defines a very basic setup for 3 EC2 instances running on AWS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;&lt;span class="nx"&gt;locals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;server_names&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"webserver1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"webserver2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"webserver3"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_instance"&lt;/span&gt; &lt;span class="s2"&gt;"web"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;server_names&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="nx"&gt;ami&lt;/span&gt;                       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ami-0a1ee2fb28fe05df3"&lt;/span&gt;
  &lt;span class="nx"&gt;instance_type&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"t3.micro"&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;server_names&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Terraform stores references to resources created by using the &lt;code&gt;count&lt;/code&gt; meta-argument in its internal state in an array using an index-based approach.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6f0qk32w98fdpfcmv28m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6f0qk32w98fdpfcmv28m.png" alt="Terraform state listing showing three web server instance details"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This works fine as long as no single instance has to be replaced or deleted. Such an action affects all resources located at a higher index in the array because of the way Terraform manages its state.&lt;/p&gt;

&lt;p&gt;Trying to remove "webserver2" in the example above&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;&lt;span class="nx"&gt;locals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;server_names&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"webserver1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"webserver2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"webserver3"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="err"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;will result in the destruction of the EC2 instance tagged "webserver3" and a renaming of the instance previously named "webserver2" to "webserver3". The result does not correspond to the expressed intention.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkl5ec2v4gifkz0dfpmg3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkl5ec2v4gifkz0dfpmg3.png" alt="Terraform listing showing the unintended result of refactoring a structure built upon count"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Version 0.12.6 of Terraform introduced the &lt;code&gt;for_each&lt;/code&gt; meta-argument - a more flexible way to create identical/similar resources.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_instance"&lt;/span&gt; &lt;span class="s2"&gt;"web"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;toset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;server_names&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="nx"&gt;ami&lt;/span&gt;                         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ami-0a1ee2fb28fe05df3"&lt;/span&gt;
  &lt;span class="nx"&gt;instance_type&lt;/span&gt;               &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"t3.micro"&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; 
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Terraform state no longer references the resources by index but by key. It is now possible to address a single resource without affecting the others.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fstiice0glkzwlcwkxqm9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fstiice0glkzwlcwkxqm9.png" alt="Terraform listing showing a correctly refactored structure based upon `for_each`"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The removal of "webserver2" can now be performed successfully without affecting other resources.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv3ljhaic16c74yw4cswd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv3ljhaic16c74yw4cswd.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;
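
&lt;p&gt;As a minimal sketch of this removal (assuming the three server names used throughout this article and the &lt;code&gt;aws_instance.web&lt;/code&gt; resource shown earlier; the output name is hypothetical), dropping one name from the list leaves the keys, and therefore the state addresses, of the remaining instances untouched:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;locals {
  # "webserver2" has been removed from the original list of three names.
  server_names = ["webserver1", "webserver3"]
}

# With for_each, the remaining instances keep their addresses:
#   aws_instance.web["webserver1"]
#   aws_instance.web["webserver3"]
# Only aws_instance.web["webserver2"] is planned for destruction.

# A single instance can be addressed by its key (hypothetical output).
output "webserver3_id" {
  value = aws_instance.web["webserver3"].id
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;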

&lt;p&gt;Due to the greater flexibility of &lt;code&gt;for_each&lt;/code&gt;, it might be helpful or even required to refactor existing code (migrate from &lt;code&gt;count&lt;/code&gt; to &lt;code&gt;for_each&lt;/code&gt;). In the past, this was possible by manipulating the Terraform state directly using the &lt;code&gt;terraform state mv&lt;/code&gt; CLI command. However, all manual state manipulations are brittle and error-prone, which makes them a last resort.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;From imperative to explicit&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;HashiCorp introduced an improved refactoring experience with version 1.1 of Terraform: the &lt;a href="https://www.terraform.io/language/modules/develop/refactoring" rel="noopener noreferrer"&gt;&lt;code&gt;moved block&lt;/code&gt;&lt;/a&gt; syntax, which allows refactoring steps to be expressed in code instead of being performed imperatively via the CLI.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;moved block&lt;/code&gt; specifies the old and the new address of a resource, as shown in the following example, which has been rewritten to use &lt;code&gt;for_each&lt;/code&gt; instead of &lt;code&gt;count&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;&lt;span class="nx"&gt;locals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;server_names&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"webserver1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"webserver2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"webserver3"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;moved&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;from&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;web&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;to&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;web&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"webserver1"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;moved&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;from&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;web&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;to&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;web&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"webserver2"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;moved&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;from&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;web&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;to&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;web&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"webserver3"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_instance"&lt;/span&gt; &lt;span class="s2"&gt;"web"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;toset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;server_names&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="nx"&gt;ami&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ami-0a1ee2fb28fe05df3"&lt;/span&gt;
  &lt;span class="nx"&gt;instance_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"t3.micro"&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A subsequent &lt;code&gt;terraform plan/apply&lt;/code&gt; reveals that no instance will be destroyed or modified in any way; each one is only moved in the state from its old address to the new one that results from the way &lt;code&gt;for_each&lt;/code&gt; works. Manual state manipulation is no longer needed, and everything can be done safely using Terraform's native workflow.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4vi8u35moj0xiu6gul18.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4vi8u35moj0xiu6gul18.png" alt="Terraform listing showing how moved blocks work"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;moved blocks&lt;/code&gt; can be used not only to refactor &lt;code&gt;count&lt;/code&gt; into &lt;code&gt;for_each&lt;/code&gt; syntax but also to rename resources, to move resources into modules, and so on. Not everything is possible using the new language element, but many (not extremely complex) refactoring tasks can benefit from it. Terraform's documentation contains different examples and use cases with further details.&lt;/p&gt;
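
&lt;p&gt;A minimal sketch of two such refactorings could look as follows. All resource and module names are hypothetical, and the target resources are assumed to be declared elsewhere in the configuration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;# Rename a resource: the real security group stays as it is,
# only its address in the state changes.
moved {
  from = aws_security_group.web
  to   = aws_security_group.frontend
}

# Move a resource into a child module (assumes a module "database"
# that declares a resource "aws_db_instance" "this").
moved {
  from = aws_db_instance.main
  to   = module.database.aws_db_instance.this
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;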

&lt;h3&gt;
  
  
  Wrap-up
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;moved blocks&lt;/code&gt; have made refactoring existing Terraform projects easier and safer to perform. For many use cases, manual steps are no longer required, even though &lt;code&gt;terraform state mv&lt;/code&gt; is still there to solve problems which cannot be tackled by using the new element. It is helpful to have tooling/framework elements like this at hand.&lt;/p&gt;

&lt;p&gt;Depending on the type and size of the project (internal project or public module), it might make sense not to delete the blocks after the changes have been applied; HashiCorp even recommends keeping them. Not everyone using the module might already have fetched the latest version. Apart from avoiding trouble for users, it might be helpful to document any significant changes to the project structure for later reviews. A short, well-written, and dated comment combined with the &lt;code&gt;moved block&lt;/code&gt; syntax might answer your question or that of a colleague six months down the road.&lt;/p&gt;
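
&lt;p&gt;Such a documented block might look like the following sketch. The date and the wording of the comment are purely illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;# 2023-01-15 (illustrative date): aws_instance.web was migrated from
# count to for_each. This block is kept so that states created with
# older versions of the configuration are moved instead of recreated.
moved {
  from = aws_instance.web[0]
  to   = aws_instance.web["webserver1"]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;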

</description>
      <category>terraform</category>
      <category>cloud</category>
      <category>devops</category>
    </item>
    <item>
      <title>Automate DevOps Workflows using AWS StepFunctions Service Integrations</title>
      <dc:creator>Thomas Laue</dc:creator>
      <pubDate>Tue, 31 May 2022 18:48:45 +0000</pubDate>
      <link>https://forem.com/fmegroup/automate-devops-workflows-using-aws-stepfunctions-service-integrations-250c</link>
      <guid>https://forem.com/fmegroup/automate-devops-workflows-using-aws-stepfunctions-service-integrations-250c</guid>
      <description>&lt;p&gt;&lt;a href="https://aws.amazon.com/step-functions/?step-functions.sort-by=item.additionalFields.postDateTime&amp;amp;step-functions.sort-order=desc" rel="noopener noreferrer"&gt;AWS Step Functions&lt;/a&gt;, a serverless workflow orchestration service offered by AWS, has been around for several years now. Many blog posts (like &lt;a href="https://aws.amazon.com/blogs/devops/using-aws-step-functions-state-machines-to-handle-workflow-driven-aws-codepipeline-actions/" rel="noopener noreferrer"&gt;Using AWS Step Functions State Machines to Handle Workflow-Driven AWS CodePipeline Actions&lt;/a&gt;), presentations and learning courses (e.g. &lt;a href="https://theburningmonk.thinkific.com/courses/complete-guide-to-aws-step-functions" rel="noopener noreferrer"&gt;Complete guide to AWS Step Functions&lt;/a&gt;) have been published showing the capabilities and rich feature set provided.&lt;/p&gt;

&lt;p&gt;However, not many of them deal with topics related to DevOps tasks -- maybe because Step Functions offered only a limited set of direct service integrations, like &lt;a href="https://aws.amazon.com/lambda/" rel="noopener noreferrer"&gt;AWS Lambda&lt;/a&gt;, until recently. Accessing an AWS API required using, for instance, an AWS SDK or AWS CLI commands in a script or Lambda function, but this changed a few months ago.&lt;/p&gt;

&lt;p&gt;In September 2021, AWS added support for over 200 AWS services through AWS SDK integrations, making over 9,000 AWS API actions available. Only a few weeks before, another major enhancement had been released: the new Workflow Studio, a low-code visual tool for building state machines. It is now easier than ever to build workflows -- from simple to complex.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The challenge&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Around the same time, we joined a migration project at a customer who was moving a large application, which had been hosted on-premises so far, to AWS using services like EC2, RDS, and ALB. Some of the typical operational tasks, like managing the database servers, are now gone as AWS takes care of the heavy lifting, but new ones have arrived and others stay the same.&lt;/p&gt;

&lt;p&gt;As the project proceeded, we thought about how we could automate as many operational tasks as possible using native AWS services. AWS Step Functions service integrations came along at just the right time to make our life much easier. We were able to handle many recurring tasks by creating state machines which are either triggered by scheduled &lt;a href="https://aws.amazon.com/eventbridge/" rel="noopener noreferrer"&gt;Amazon EventBridge&lt;/a&gt; rules or started manually via CLI or Console.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The simple one&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fex7ztjkpman23lfcdeei.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fex7ztjkpman23lfcdeei.png" alt="Simple Workflow" width="505" height="401"&gt;&lt;/a&gt; A workflow consisting of only two steps (not counting &lt;em&gt;Start&lt;/em&gt; and &lt;em&gt;End&lt;/em&gt;) is triggered shortly before the next EC2 maintenance window to get an overview of all security patches which will be installed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/systems-manager/" rel="noopener noreferrer"&gt;AWS Systems Manager&lt;/a&gt;'s service integration &lt;em&gt;ssm:describeInstancePatches&lt;/em&gt; is used to get the list of all patches, which is then sent to an AWS SNS topic and delivered to the email inbox of someone who is in charge of checking whether there might be a conflict with the application requirements.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq32d1x13m801vxhc2z2h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq32d1x13m801vxhc2z2h.png" alt="API Parameter" width="586" height="450"&gt;&lt;/a&gt; The Workflow Studio editor makes it quite easy to assemble a workflow and to enrich every step with the required parameters and settings. All service integrations are based on the AWS SDK API calls so that the parameters can be retrieved from the SDK documentation (an example is shown for Systems Manager API).&lt;/p&gt;

&lt;p&gt;Workflow Studio allows exporting the state machine definition to a JSON or YAML file so that it can be included in an infrastructure as code project using, for instance, Terraform.&lt;/p&gt;

&lt;p&gt;Information like the EC2 instance ID or the SNS topic ARN can be resolved at deploy time using, for instance, Terraform template variables as shown in the example JSON state machine definition.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyr4wdkjfxo9rlerohr41.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyr4wdkjfxo9rlerohr41.png" alt="State Machine Definition" width="800" height="809"&gt;&lt;/a&gt;&lt;/p&gt;
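
&lt;p&gt;For illustration, a comparable two-step workflow might be declared in Terraform roughly as follows. This is only a sketch under assumptions: it builds the definition with &lt;code&gt;jsonencode&lt;/code&gt; and input variables instead of the template file approach shown in the screenshot, and the variable names, state names and the state machine name are hypothetical:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;variable "instance_id" {
  type = string
}

variable "sns_topic_arn" {
  type = string
}

variable "state_machine_role_arn" {
  type = string
}

resource "aws_sfn_state_machine" "patch_overview" {
  name     = "patch-overview" # hypothetical name
  role_arn = var.state_machine_role_arn

  definition = jsonencode({
    Comment = "Send the list of patches of an instance to an SNS topic"
    StartAt = "DescribeInstancePatches"
    States = {
      # AWS SDK service integration, no Lambda function required
      DescribeInstancePatches = {
        Type     = "Task"
        Resource = "arn:aws:states:::aws-sdk:ssm:describeInstancePatches"
        Parameters = {
          InstanceId = var.instance_id
        }
        Next = "PublishPatchList"
      }
      # Publish the patch list as a JSON string to the topic
      PublishPatchList = {
        Type     = "Task"
        Resource = "arn:aws:states:::sns:publish"
        Parameters = {
          TopicArn    = var.sns_topic_arn
          "Message.$" = "States.JsonToString($.Patches)"
        }
        End = true
      }
    }
  })
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The instance ID and the topic ARN are resolved by Terraform at deploy time, so the deployed definition contains concrete values.&lt;/p&gt;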

&lt;p&gt;The big benefit of using Step Functions is that no custom code and no additional overhead for managing a Lambda function is required to complete this task and the best thing: the state machine is quite intuitive to create, self-documenting and easy to follow and to recap.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The more complex process&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Following the same principles, it is possible to create more complex workflows. The given example shows a workflow which restarts, in a rolling manner, all servers belonging to the web app tier behind an AWS Application Load Balancer. No application downtime is required in order to restart them, as only a certain number is restarted at once.&lt;/p&gt;

&lt;p&gt;In the first step, the alarm actions of some CloudWatch alarms which should not fire during the restart process, as well as some AWS EventBridge rules, are disabled using a Lambda function, as the logic to filter these resources needs some custom code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffpg0gjq5r128tc09tcxz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffpg0gjq5r128tc09tcxz.png" alt="More Complex Workflow" width="774" height="973"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A property of the AWS Step Functions &lt;em&gt;Map&lt;/em&gt; state, the &lt;em&gt;Maximum Concurrency Control&lt;/em&gt;, is used to restrict the number of instances which are processed at the same time. Each instance is deregistered from the ALB target group, rebooted, and finally checked to see whether the application has launched successfully before it is brought back into the target group.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhw0oslih1nxnhhwyihu1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhw0oslih1nxnhhwyihu1.png" alt="Concurrency Control Setting" width="572" height="161"&gt;&lt;/a&gt;&lt;/p&gt;
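
&lt;p&gt;In a state machine definition, this setting corresponds to the &lt;em&gt;MaxConcurrency&lt;/em&gt; field of the &lt;em&gt;Map&lt;/em&gt; state. The following Terraform sketch shows only a shortened &lt;em&gt;Map&lt;/em&gt; state (wait, startup check and re-registration are omitted). The input shape, the variable and all state names are assumptions for illustration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;variable "target_group_arn" {
  type = string
}

locals {
  # Map state only; it would be placed inside the "States" object of a
  # complete definition that is passed through jsonencode().
  rolling_restart_state = {
    Type           = "Map"
    ItemsPath      = "$.InstanceIds" # assumed input: list of instance ID strings
    MaxConcurrency = 1               # process one instance at a time
    Iterator = {
      StartAt = "DeregisterTarget"
      States = {
        DeregisterTarget = {
          Type     = "Task"
          Resource = "arn:aws:states:::aws-sdk:elasticloadbalancingv2:deregisterTargets"
          Parameters = {
            TargetGroupArn = var.target_group_arn
            Targets        = [{ "Id.$" = "$" }]
          }
          ResultPath = null # discard the API response, keep the instance ID
          Next       = "RebootInstance"
        }
        RebootInstance = {
          Type     = "Task"
          Resource = "arn:aws:states:::aws-sdk:ec2:rebootInstances"
          Parameters = {
            "InstanceIds.$" = "States.Array($)"
          }
          ResultPath = null
          End        = true
        }
      }
    }
    End = true
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;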

&lt;p&gt;Rebooting only a limited number of instances makes sure that the application stays online, and that always enough servers are available to handle user traffic without a significant influence on the user experience.&lt;/p&gt;

&lt;p&gt;The new AWS SDK service integrations help again to model the workflow, as a sequence of steps must be followed in order to reboot a running instance successfully. A server has to be deregistered from and later registered with the target group (using, among others, the &lt;em&gt;elasticloadbalancingv2:registerTargets&lt;/em&gt; SDK command), and it has to be rebooted (&lt;em&gt;ec2:rebootInstances&lt;/em&gt;).&lt;/p&gt;

&lt;p&gt;After a certain wait period, an application startup check is performed to make sure that everything is working correctly using a Lambda function as the whole check process requires again some custom logic. Only a healthy and working server should be put back into the ALB target group.&lt;/p&gt;

&lt;p&gt;The application requires some minutes to get everything sorted out until it is ready to serve, and the startup time varies depending on factors like external database connections. The &lt;em&gt;Wait&lt;/em&gt; state helps in this case to pause the workflow for a certain time. Nevertheless, it can happen that the following startup check fails because the application is not yet ready and another wait period is required.&lt;/p&gt;

&lt;p&gt;A built-in "for-loop" feature for Step Functions would be quite helpful in this case to re-run the last two steps (wait + startup check). It is possible to model this construct using a &lt;em&gt;Choice&lt;/em&gt; state which checks the result returned by the startup check Lambda function and acts upon it (i.e., goes back to the &lt;em&gt;Wait&lt;/em&gt; state if the application is not ready yet).&lt;/p&gt;

&lt;p&gt;However, this feels somewhat clumsy and more like a workaround. Additionally, a break condition (e.g., a maximum number of checks) is required, which introduces state that must be passed around or stored somewhere.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmphfu3g9jh3qv9pl5mz2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmphfu3g9jh3qv9pl5mz2.png" alt="Kind of For-Loop" width="682" height="530"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Custom Retry and Error Handling&lt;/em&gt; for Lambda functions, another cool feature of Step Functions, comes to our rescue. Custom errors which are thrown from a Lambda function can be handled. Depending on the use case, a &lt;em&gt;Catcher&lt;/em&gt; or a &lt;em&gt;Retrier&lt;/em&gt; for this custom error class might be defined to deal with this situation. The latter is used to simulate a "for-loop" without relying on the &lt;em&gt;Choice&lt;/em&gt; state workaround.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkyz8bsj4v7myqoxfxj45.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkyz8bsj4v7myqoxfxj45.png" alt="Code Snippet of Custom Error Class" width="661" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Lambda function raises a custom &lt;em&gt;InstanceNotYetStartedException&lt;/em&gt; in case the health check fails. This exception is handled by a specific &lt;em&gt;Retrier&lt;/em&gt; which defines a longer wait interval (120 seconds) to give the application some additional time before the next check. This whole procedure is repeated up to three times in this case. After that, it can be assumed that something went wrong, and the failure is handled differently (processing moves on to a &lt;em&gt;States.ALL Catcher&lt;/em&gt; which calls an SNS integration step for publishing an alarm).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl7yjm9flexmq0ou510ol.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl7yjm9flexmq0ou510ol.png" alt="Error and Retry Handling Parameters" width="556" height="881"&gt;&lt;/a&gt;&lt;/p&gt;
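
&lt;p&gt;Expressed as part of a definition managed with Terraform, the retry settings described above might look like the following sketch. The error name, the interval and the number of attempts come from the text. The variable, the state names, the constant backoff rate and the placement of the &lt;em&gt;Catcher&lt;/em&gt; on the task itself are assumptions:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;variable "startup_check_lambda_arn" {
  type = string
}

locals {
  # Task state only; it would be placed inside the "States" object of a
  # complete definition that is passed through jsonencode().
  startup_check_state = {
    Type     = "Task"
    Resource = "arn:aws:states:::lambda:invoke"
    Parameters = {
      FunctionName = var.startup_check_lambda_arn
      "Payload.$"  = "$"
    }
    # Re-run the check if the application is not ready yet.
    Retry = [{
      ErrorEquals     = ["InstanceNotYetStartedException"]
      IntervalSeconds = 120
      MaxAttempts     = 3
      BackoffRate     = 1 # assumed: constant wait time between attempts
    }]
    # Everything else, including exhausted retries, ends up here.
    Catch = [{
      ErrorEquals = ["States.ALL"]
      Next        = "PublishAlarm" # hypothetical SNS publish state
    }]
    Next = "RegisterTarget" # hypothetical next state
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;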

&lt;p&gt;As a last note on this workflow: the &lt;em&gt;Map&lt;/em&gt; state fails as soon as one of its iterations has failed. All running inner executions are aborted and all waiting ones are cancelled. Care should be taken for this scenario: adding a dead letter queue to the inner &lt;em&gt;Map&lt;/em&gt; state workflow would be one option, defining a &lt;em&gt;States.ALL Catcher&lt;/em&gt; on the &lt;em&gt;Map&lt;/em&gt; state level another one, or even failing the complete state machine execution on purpose. The best error handling method depends on the workflow requirements. The global &lt;em&gt;Catcher&lt;/em&gt; is used in the presented case as some additional steps (putting the deactivated CloudWatch alarms back in place) must be executed in all cases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When not to use&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Like every other AWS service, Step Functions has some limits which might prevent one from using it in some rare cases or which require a workaround. Furthermore, there are external API properties which might not fit Step Functions well. Two examples are briefly discussed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Maximum input/output size for a task is 256 KB: AWS API calls might return a lot of JSON data, but there are various mechanisms like the &lt;em&gt;filters&lt;/em&gt; parameter and &lt;em&gt;pagination&lt;/em&gt; support in place to narrow down the scope of a request. Additionally, Step Functions provides output processing functions to extract the data of interest so that this limitation should not be a blocker for most use cases.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How to deal with API calls supporting pagination: many AWS API endpoints return a maximum number of items and an additional &lt;em&gt;NextToken&lt;/em&gt; value which can be used to retrieve the next batch with a subsequent call. The clumsy &lt;em&gt;Choice&lt;/em&gt;-state construct mentioned above could be used to handle this, but this is not practical. A Lambda function is much better suited to this situation when a lot of data must be retrieved.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Wrapping up&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This blog post presents use cases for Step Functions which might not be the most common ones out there but proved to be extremely useful. The new SDK integrations have opened a wide field of possibilities to model workflows visually without writing a lot of custom code (even though Lambda is always there if something cannot be solved by built-in mechanisms).&lt;/p&gt;

&lt;p&gt;The Step Functions Workflow Studio makes it possible to design and build workflows, from simple to quite complex ones, in an intuitive and rapid way. The ready-to-use workflow can be exported to code (is JSON code?) so that a developer's heart does not need to cry and it can be integrated into an infrastructure as code framework.&lt;/p&gt;

&lt;p&gt;Some additional features like more intrinsic functions (e.g., string processing) to deal with the sometimes very large JSON results of AWS SDK calls would make working with Step Functions even easier (big point for &lt;a href="https://twitter.com/search?q=%23awswishlist" rel="noopener noreferrer"&gt;#awswishlist&lt;/a&gt;).&lt;/p&gt;

</description>
      <category>python</category>
      <category>serverless</category>
      <category>aws</category>
      <category>devops</category>
    </item>
    <item>
      <title>Keep your CloudWatch bill under control when running AWS Lambda at scale</title>
      <dc:creator>Thomas Laue</dc:creator>
      <pubDate>Tue, 19 Jan 2021 21:11:56 +0000</pubDate>
      <link>https://forem.com/tlaue/keep-your-cloudwatch-bill-under-control-when-running-aws-lambda-at-scale-3o40</link>
      <guid>https://forem.com/tlaue/keep-your-cloudwatch-bill-under-control-when-running-aws-lambda-at-scale-3o40</guid>
      <description>&lt;p&gt;In this post, I show a way to keep the AWS CloudWatch costs caused by log messages coming from AWS Lambda under control without losing insights and debug information in case of errors. A logger with a built-in cache mechanism is presented. It manages the number of messages sent to AWS CloudWatch depending on the log level and function invocation result.&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS Lambda and AWS CloudWatch
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/lambda/" rel="noopener noreferrer"&gt;AWS Lambda&lt;/a&gt;, the serverless compute service offered by AWS, sends all log messages (platform as well as custom messages) to &lt;a href="https://aws.amazon.com/cloudwatch/" rel="noopener noreferrer"&gt;AWS CloudWatch&lt;/a&gt;. Log messages are sorted into log groups and streams which are associated with the Lambda function and its invocations from which the messages originated.&lt;/p&gt;

&lt;p&gt;Depending on the AWS region, CloudWatch charges for data ingestion (up to $0.90 per GB) and data storage (up to $0.0408 per GB and month). These fees add up really quickly, and it is not uncommon to spend a lot more on CloudWatch logs (sometimes up to 10 times more) than on Lambda itself in a production environment. In addition, log files are often sent from CloudWatch to 3rd party systems for analysis, adding even more costs to the bill.&lt;/p&gt;

&lt;h3&gt;
  
  
  Logging
&lt;/h3&gt;

&lt;p&gt;Nevertheless, log files are an important resource to debug problems and to get deeper insights into the behavior of a serverless system. Every logged detail might help to identify issues and to fix bugs and problems. &lt;a href="https://theburningmonk.com/2018/01/you-need-to-use-structured-logging-with-aws-lambda/" rel="noopener noreferrer"&gt;Structured logging&lt;/a&gt; is important as log files can be analyzed much more easily (e.g. with &lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AnalyzingLogData.html" rel="noopener noreferrer"&gt;AWS CloudWatch Insights&lt;/a&gt;), which saves time and engineering costs. The &lt;a href="https://github.com/getndazn/dazn-lambda-powertools/tree/master/packages/lambda-powertools-logger" rel="noopener noreferrer"&gt;dazn-lambda-powertools&lt;/a&gt; library provides a logger that supports structured logging for Node.js; the &lt;a href="https://awslabs.github.io/aws-lambda-powertools-python/" rel="noopener noreferrer"&gt;AWS Lambda Powertools&lt;/a&gt; offer the same for Python and Java.&lt;/p&gt;

&lt;p&gt;Furthermore, it is highly recommended to reduce the retention time of CloudWatch log groups to a suitable time period. By default, logs are stored forever, leading to increasing costs over time. The retention policy for every log group can be changed manually using the AWS Console or, preferably, by using an automated approach provided for instance by &lt;a href="https://serverlessrepo.aws.amazon.com/applications/us-east-1/374852340823/auto-set-log-group-retention" rel="noopener noreferrer"&gt;this&lt;/a&gt; AWS SAR app.&lt;/p&gt;

&lt;p&gt;Finally, &lt;a href="https://theburningmonk.com/2018/04/you-need-to-sample-debug-logs-in-production/" rel="noopener noreferrer"&gt;sampling debug logs&lt;/a&gt; might cut off the biggest part of the CloudWatch Logs bill, especially when running AWS Lambda at scale, without losing the complete insight into the system. Depending on the sampling rate (which has to be representative for a workload), a certain amount of debugging information is available for monitoring and diagnostics.&lt;/p&gt;

&lt;p&gt;The following image shows a CloudWatch log stream belonging to a Lambda function for which a sampling rate of 10 % was used for demonstration purposes. A reasonable value for production will probably be much lower (e.g. 1%).  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fdp2up6aj533r2gb7bbdf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fdp2up6aj533r2gb7bbdf.png" alt="CloudWatch log stream with debug output for about every 10th Lambda invocation"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Problem with sampling debug logs
&lt;/h3&gt;

&lt;p&gt;Nevertheless, as life goes, the sampling might not be in place when something goes wrong (e.g. a bug which only happens for edge cases), leaving a developer without the detailed information needed to fix the issue. For instance, the invocation event or the parameters of database or external API requests are of interest in case of issues.&lt;/p&gt;

&lt;p&gt;A logger that caches all messages which are not written to the output stream as their severity is below the defined log level could be used. The cached messages would only be sent to CloudWatch in case of a program error - in addition to the error information to get a full picture of the function invocation. This idea originated from the Production-Ready Serverless course by &lt;a href="https://theburningmonk.com/" rel="noopener noreferrer"&gt;Yan Cui&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;A reduced version of the logger which is based on the dazn-lambda-powertools-logger:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;log&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@dazn/lambda-powertools-logger&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;LogLevels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;DEBUG&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;INFO&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;WARN&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;ERROR&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Logger&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="nx"&gt;logMessages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="nx"&gt;level&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;DEBUG&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="nx"&gt;level&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;level&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;handleMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;levelName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;debug&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;levelName&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;level&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;LogLevels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;levelName&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toUpperCase&lt;/span&gt;&lt;span class="p"&gt;()];&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;level&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;LogLevels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="nx"&gt;level&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addToCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;levelName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;addToCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;levelName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="nx"&gt;logMessages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;levelName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;writeAllMessages&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// The log level of the log has to be set to "debug" as&lt;/span&gt;
      &lt;span class="c1"&gt;// the current log level might prevent messages from&lt;/span&gt;
      &lt;span class="c1"&gt;// being logged.&lt;/span&gt;
      &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enableDebug&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="nx"&gt;logMessages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;levelName&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toLowerCase&lt;/span&gt;&lt;span class="p"&gt;()](...&lt;/span&gt;&lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;resetLevel&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nf"&gt;debug&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;globalLogger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;handleMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;debug&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;globalLogger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;handleMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;info&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;globalLogger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;handleMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;warn&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;globalLogger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;handleMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;error&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nf"&gt;writeAllMessages&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;globalLogger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;writeAllMessages&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;globalLogger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Logger&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exports&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Logger&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The logger provides methods for the most common log levels. A message is either written to the output stream or added to the internal cache, depending on the current log level defined in the Lambda environment. If required, all cached messages can be written out as well using the "writeAllMessages" method.&lt;/p&gt;
&lt;h3&gt;
  
  
  How to use the logger within AWS Lambda
&lt;/h3&gt;

&lt;p&gt;All required logic (including sample logging configuration) has been added to a wrapper that receives the Lambda handler function as an argument. This wrapper can be reused for any Lambda function and published for instance in a private NPM package.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;middy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;middy&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sampleLogging&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@dazn/lambda-powertools-middleware-sample-logging&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;log&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./logger&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exports&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;lambdaHandler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lambdaWrapper&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Input event...`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;lambdaHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

      &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s2"&gt;`Function [&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;functionName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;] finished successfully with result: [&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="nx"&gt;response&lt;/span&gt;
        &lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;] at [&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;]`&lt;/span&gt;
      &lt;span class="p"&gt;);&lt;/span&gt;

      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;writeAllMessages&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clear&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;middy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;lambdaWrapper&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nf"&gt;sampleLogging&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;sampleRate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;parseFloat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;SAMPLE_DEBUG_LOG_RATE&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;0.01&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
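
&lt;p&gt;The call to log.clear() in the finally block matters because Lambda may reuse an execution environment for several invocations, so module-level state such as the message cache survives between them. The following sketch illustrates the effect with a hypothetical MessageCache class and invoke function; the real implementation of clear() is elided in the logger listing, so this is only an assumption of how it might behave.&lt;/p&gt;

```javascript
// Hypothetical stand-in for the logger's internal cache.
class MessageCache {
  #messages = [];

  add(levelName, ...params) {
    this.#messages.push({ levelName, params });
  }

  clear() {
    this.#messages = [];
  }

  get size() {
    return this.#messages.length;
  }
}

// Module scope: this instance lives as long as the execution
// environment, not just for a single invocation.
const cache = new MessageCache();

// Hypothetical wrapper: always empties the cache after an invocation,
// regardless of whether the handler succeeded or threw.
async function invoke(handler, event) {
  try {
    return await handler(event, cache);
  } finally {
    cache.clear();
  }
}
```

&lt;p&gt;Without the cleanup, debug messages cached during earlier successful invocations would be written out together with those of a later failing one.&lt;/p&gt;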


&lt;p&gt;An example of a simple Lambda handler in which some user information is retrieved from DynamoDB is given below. This function fails on a random basis to demonstrate logger behavior.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;DynamoDB&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@aws-sdk/client-dynamodb&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
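&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;marshall&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;unmarshall&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@aws-sdk/util-dynamodb&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// The module paths below are assumptions; adjust them to where the&lt;/span&gt;
&lt;span class="c1"&gt;// logger and the wrapper shown above are located.&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;log&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./logger&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;wrapper&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./wrapper&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;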

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dynamoDBClient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;DynamoDB&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;eu-central-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queryStringParameters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;age&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getUserDetailsFromDB&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;An error occurred&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;statusCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;age&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Response...`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;getUserDetailsFromDB&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Get user information for user with id...`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Item&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;dynamoDBClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getItem&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;TableName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;TABLE_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;marshall&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userDetails&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;unmarshall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;Item&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Retrieved user information...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;userDetails&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;userDetails&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exports&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;wrapper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;A small sample application (as shown by the &lt;a href="https://lumigo.io" rel="noopener noreferrer"&gt;lumigo&lt;/a&gt; platform) demonstrates the different logger behavior: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fsrkh3trjl2ef8u3enqn9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fsrkh3trjl2ef8u3enqn9.png" alt="Alt Sample application"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;A successful invocation of the sample app with the log level set to "INFO" does not write out any debug messages (except in the rare case of a sampled invocation):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fdhbnb5cyxu053zks17vl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fdhbnb5cyxu053zks17vl.png" alt="Alt Successful invocation of the sample application with resulting log stream"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;However, all debug information will be sent to CloudWatch Logs in case of an error, as can be seen below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fvrou1oz8cpos5txxouzl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fvrou1oz8cpos5txxouzl.png" alt="Alt Failed invocation of the sample application with resulting log stream"&gt;&lt;/a&gt; &lt;/p&gt;
&lt;h3&gt;
  
  
  Caveats
&lt;/h3&gt;

&lt;p&gt;Platform errors like timeouts or out-of-memory issues will not trigger the logger logic, as the function does not run to completion but is terminated by the Lambda runtime.&lt;/p&gt;
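
&lt;p&gt;For timeouts, one possible mitigation is to write out the cached messages shortly before the deadline. The sketch below is an assumption and not part of the sample repository: withTimeoutFlush and safetyMarginMs are hypothetical names, while context.getRemainingTimeInMillis() is provided by the Lambda context object in Node.js. The flush argument would be something like the logger's writeAllMessages method.&lt;/p&gt;

```javascript
// Hypothetical helper: schedules `flush` to run shortly before the
// invocation would time out and cancels it if the handler finishes first.
function withTimeoutFlush(handler, flush, safetyMarginMs = 100) {
  return async (event, context) => {
    const delay = Math.max(
      context.getRemainingTimeInMillis() - safetyMarginMs,
      0
    );
    const timer = setTimeout(flush, delay);

    try {
      return await handler(event, context);
    } finally {
      // The handler completed (or threw) in time, so no flush is needed here.
      clearTimeout(timer);
    }
  };
}
```

&lt;p&gt;This only helps while the event loop is free to run the timer; a handler that blocks the event loop, or an out-of-memory termination, is still not covered.&lt;/p&gt;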
&lt;h3&gt;
  
  
  Takeaways
&lt;/h3&gt;

&lt;p&gt;Logging is one of the most important tools for gaining insight into the behavior of any system, including AWS Lambda. CloudWatch Logs centralizes and manages the logs of most AWS services. It is not free, but there are options such as sampling debug logs in production to reduce the bill. As this might result in NO debug logs in case of an error, a logger with an internal cache has been presented which outputs all cached logs, but only in case of a problem. This logger can be combined with the sample logging strategy to keep the bill low while still getting all information when it is really required.&lt;/p&gt;

&lt;p&gt;Let me know if you found this useful and what other approaches are used to keep the CloudWatch bill reasonable without losing all insights. Thank you for reading.  &lt;/p&gt;

&lt;p&gt;The full code including a small test application can be found in: &lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev.to%2Fassets%2Fgithub-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/TLaue" rel="noopener noreferrer"&gt;
        TLaue
      &lt;/a&gt; / &lt;a href="https://github.com/TLaue/logger-with-cache" rel="noopener noreferrer"&gt;
        logger-with-cache
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      An example of a logger for AWS Lambda which caches all messages 
    &lt;/h3&gt;
  &lt;/div&gt;
&lt;/div&gt;



</description>
      <category>aws</category>
      <category>javascript</category>
      <category>serverless</category>
      <category>cloud</category>
    </item>
  </channel>
</rss>
