<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Maciej Radzikowski</title>
    <description>The latest articles on Forem by Maciej Radzikowski (@mradzikowski).</description>
    <link>https://forem.com/mradzikowski</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F131827%2F10fdbad2-0794-42cf-a075-ba82c6c4c48c.jpeg</url>
      <title>Forem: Maciej Radzikowski</title>
      <link>https://forem.com/mradzikowski</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/mradzikowski"/>
    <language>en</language>
    <item>
      <title>25 Good and Bad Serverless (and other) Announcements from re:Invent 2023</title>
      <dc:creator>Maciej Radzikowski</dc:creator>
      <pubDate>Mon, 04 Dec 2023 16:02:31 +0000</pubDate>
      <link>https://forem.com/aws-builders/25-good-and-bad-serverless-and-other-announcements-from-reinvent-2023-18oc</link>
      <guid>https://forem.com/aws-builders/25-good-and-bad-serverless-and-other-announcements-from-reinvent-2023-18oc</guid>
      <description>&lt;p&gt;Originally published at &lt;a href="https://betterdev.blog/25-good-and-bad-serverless-announcements-reinvent-2023/"&gt;BetterDev.blog&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AWS re:Invent 2023 ended, so let’s look at the most interesting announcements for serverless and more. There are exciting &lt;strong&gt;features to try&lt;/strong&gt; and, maybe more importantly, &lt;strong&gt;features to avoid&lt;/strong&gt;!&lt;/p&gt;

&lt;p&gt;Looking at the announcements, I think &lt;strong&gt;we are entering the next era for serverless and cloud computing&lt;/strong&gt; in general. There are fewer groundbreaking new services or big features. Instead, there are more quality of (developer) life improvements. Enough to say that four of the announcements below are about CloudWatch Logs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;And that’s a great thing.&lt;/strong&gt; It means that things are stable and most capability gaps are closed, so we can focus on building our solutions on top instead of making workarounds (in most cases, at least).&lt;/p&gt;

&lt;p&gt;That doesn’t mean there were no important updates. For me, &lt;strong&gt;Step Functions and CloudWatch Logs teams are winning&lt;/strong&gt; this re:Invent in terms of the best improvements and new features.&lt;/p&gt;

&lt;p&gt;Okay, let’s see what’s new.&lt;/p&gt;

&lt;h2&gt;Serverless&lt;/h2&gt;

&lt;h3&gt;Step Functions: HTTP request task&lt;/h3&gt;

&lt;p&gt;The new task type allows you to &lt;strong&gt;make HTTP requests directly from the Step Function&lt;/strong&gt;. No more need for a Lambda function to make an external API call.&lt;/p&gt;

&lt;p&gt;What’s best is that it’s &lt;strong&gt;really versatile&lt;/strong&gt;. Instead of creating integrations with specific API partners, &lt;strong&gt;the Step Functions team lets you connect to almost any API&lt;/strong&gt;. You can set the request method, body, headers, and query parameters, and even encode the body as &lt;code&gt;x-www-form-urlencoded&lt;/code&gt;. There are some limitations, sure – the request body must be provided as JSON, and not all headers are allowed. So no XML or SOAP requests, but I don’t think that’s a big issue, even though I still integrate with such APIs occasionally.&lt;/p&gt;

&lt;p&gt;An interesting design choice is &lt;strong&gt;using EventBridge Connections for authorization&lt;/strong&gt;. That makes sense – it’s already in place and supports basic auth, API keys, and OAuth. But don’t be afraid; it does not use EventBridge API Destinations, so there is no 5-second timeout limit (although I could not find any specific timeout mentioned).&lt;/p&gt;
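
&lt;p&gt;For illustration, here is a minimal CDK sketch of such a task, using a raw state definition (the endpoint and the connection ARN are placeholders, and the parameter names are taken from the announcement – treat it as a sketch, not a reference):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import * as sfn from 'aws-cdk-lib/aws-stepfunctions';
import { Construct } from 'constructs';

// Sketch: an HTTP Task calling an external API through an
// EventBridge Connection (endpoint and ARN are placeholders).
export function callExternalApiState(scope: Construct): sfn.CustomState {
  return new sfn.CustomState(scope, 'CallExternalApi', {
    stateJson: {
      Type: 'Task',
      Resource: 'arn:aws:states:::http:invoke',
      Parameters: {
        ApiEndpoint: 'https://api.example.com/orders',
        Method: 'POST',
        Authentication: {
          // secrets live in the EventBridge Connection, not here
          ConnectionArn: 'arn:aws:events:us-east-1:123456789012:connection/my-api/12345678-1234-1234-1234-123456789012',
        },
        Headers: { 'Content-Type': 'application/json' },
        'RequestBody.$': '$.order',
      },
      End: true,
    },
  });
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;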

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hgRxxhIZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/dyammnlo7agj0rnjiedn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hgRxxhIZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/dyammnlo7agj0rnjiedn.png" alt="Step Functions HTTP step parameters" width="800" height="822"&gt;&lt;/a&gt;&lt;/p&gt;
Step Functions HTTP step parameters



&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; Easily my #1 announcement of this re:Invent. &lt;strong&gt;My only issue is Secrets Manager&lt;/strong&gt;, which you must use for secrets when creating an EventBridge Connection. While EventBridge then makes its own copy of the secret, which is free and carries no usage charges, the original one still costs a flat $0.40 per month to store a few bytes of API key.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; I'm loving it 😍&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/external-endpoints-and-testing-of-task-states-now-available-in-aws-step-functions/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Step Functions: redrive failed execution&lt;/h3&gt;

&lt;p&gt;Did a Step Functions execution fail in the middle because of an external error, and now you have to re-run the whole process? Not anymore. You can &lt;strong&gt;redrive the execution starting from the failed state&lt;/strong&gt;. It’s important to know that the redrive runs on exactly the same Step Function definition, so you can’t modify the Step Function and re-run the failed step using a new, fixed workflow version.&lt;/p&gt;
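
&lt;p&gt;A minimal sketch of a redrive with the AWS SDK for JavaScript v3 (assuming the &lt;code&gt;@aws-sdk/client-sfn&lt;/code&gt; client; the execution ARN is a placeholder):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { SFNClient, RedriveExecutionCommand } from '@aws-sdk/client-sfn';

const client = new SFNClient({});

// Restart a failed execution from its failed state onwards;
// the original, unchanged definition is used.
await client.send(new RedriveExecutionCommand({
  executionArn: 'arn:aws:states:us-east-1:123456789012:execution:MyStateMachine:my-failed-run',
}));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;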

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--54QfhEB6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/9zvcpz330iizotocmpno.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--54QfhEB6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/9zvcpz330iizotocmpno.png" alt="Step Functions failure retry and redrive" width="800" height="353"&gt;&lt;/a&gt;&lt;/p&gt;
Step Functions failed step – retries and then manual redrive



&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; It’s never the fault of our code, right? But jokes aside, it’s very useful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; SF team admiration 🤗&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/compute/introducing-aws-step-functions-redrive-a-new-way-to-restart-workflows/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Step Functions: test step&lt;/h3&gt;

&lt;p&gt;Developing and testing Step Functions can be annoying: you make changes, run an execution, wait for it to reach your modified state, see errors, fix them, rerun the whole thing… You get what I mean. Wouldn’t it be great if you could &lt;strong&gt;run an individual state to test it&lt;/strong&gt;?&lt;/p&gt;

&lt;p&gt;Well, now you can. You can select and run a single state in the Step Function edit view, providing only that state’s input. What’s great is that you get very detailed, step-by-step (see what I did there?) &lt;strong&gt;logs on input/output processing&lt;/strong&gt;. This lets you catch problems like missing IAM permissions, incorrect result selectors, and so on.&lt;/p&gt;
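
&lt;p&gt;You can also script it with the TestState API; a small sketch with the AWS SDK for JavaScript v3 (the role ARN is a placeholder):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { SFNClient, TestStateCommand } from '@aws-sdk/client-sfn';

const client = new SFNClient({});

// Run a single state definition against a sample input; the DEBUG
// inspection level returns the input/output processing details.
const result = await client.send(new TestStateCommand({
  definition: JSON.stringify({
    Type: 'Pass',
    Parameters: { 'greeting.$': "States.Format('Hello, {}!', $.name)" },
    End: true,
  }),
  roleArn: 'arn:aws:iam::123456789012:role/StepFunctionsTestRole',
  input: JSON.stringify({ name: 'World' }),
  inspectionLevel: 'DEBUG',
}));
console.log(result.status, result.output, result.inspectionData);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;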

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jGPadqB_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zokojjqlxg5mshofza7s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jGPadqB_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zokojjqlxg5mshofza7s.png" alt="Step Functions HTTP test state results" width="800" height="584"&gt;&lt;/a&gt;&lt;/p&gt;
Step Functions test state shows step input and output processing details



&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; The total developer-hours this saves Step Functions users will be counted in the thousands.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; amazed 🤩&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/external-endpoints-and-testing-of-task-states-now-available-in-aws-step-functions/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Lambda: much faster scaling up&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Before&lt;/strong&gt;, when handling increased traffic, Lambda would scale up by creating 500 to 3,000 new execution environments in the first minute (depending on the region) and an additional 500 environments every minute after that. Moreover, those scaling quotas were shared by all Lambda functions in the account and region.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Now&lt;/strong&gt;, it can &lt;strong&gt;scale up by 1,000 environments every 10 seconds&lt;/strong&gt;. That’s… much faster. And each function is scaled independently, meaning each of your Lambda functions can create up to 1,000 new environments every 10 seconds.&lt;/p&gt;

&lt;p&gt;There is still the limit of total account concurrency. The default quota for Lambda concurrent executions is 1,000, and it’s even lower for new accounts. But you can request to have it raised to "tens of thousands".&lt;/p&gt;
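
&lt;p&gt;You can check where your account currently stands with a single SDK call; a sketch:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { LambdaClient, GetAccountSettingsCommand } from '@aws-sdk/client-lambda';

const client = new LambdaClient({});

// The account-wide concurrency quota that still caps the scaling
const { AccountLimit } = await client.send(new GetAccountSettingsCommand({}));
console.log('Concurrent executions quota:', AccountLimit?.ConcurrentExecutions);
console.log('Unreserved concurrency:', AccountLimit?.UnreservedConcurrentExecutions);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;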

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--a2fVL3n7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8bsql3g0pbjq7m7detqz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--a2fVL3n7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8bsql3g0pbjq7m7detqz.png" alt="Lambda scaling example" width="800" height="426"&gt;&lt;/a&gt;&lt;/p&gt;

Lambda scaling example – 1,000 new environments every 10 seconds, up to the account concurrency limit (&lt;a href="https://aws.amazon.com/blogs/aws/aws-lambda-functions-now-scale-12-times-faster-when-handling-high-volume-requests/"&gt;source&lt;/a&gt;)




&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; That's great. And you don't have to do anything to benefit from this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; positively shocked 😲&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/aws-lambda-functions-now-scale-12-times-faster-when-handling-high-volume-requests/"&gt;announcement post&lt;/a&gt;, &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/scaling-behavior.html"&gt;docs&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Lambda: future runtime launch dates&lt;/h3&gt;

&lt;p&gt;Next to Lambda runtime deprecation dates, the AWS docs now include the &lt;strong&gt;target&lt;/strong&gt; dates for the new runtime version releases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Why is it so important that I put it on the list? Because if I’m tired of people constantly asking "when Python x", "when Node y", then the AWS Lambda team must be fed up, too.&lt;/p&gt;

&lt;p&gt;Also, since we’re on this subject, &lt;strong&gt;I don’t think people understand&lt;/strong&gt; that, for AWS, adding a new runtime version is more than &lt;code&gt;wget https://nodejs.org/en/download/the-latest-node-version.zip&lt;/code&gt; – it also includes a &lt;strong&gt;commitment to maintain it for an extended time&lt;/strong&gt; for tens of thousands of clients. So people asking for a new runtime on the same day as the language release are, frankly, delusional.&lt;/p&gt;

&lt;p&gt;Also, AWS already improved the runtime upgrade cycle. Now, with this added transparency on the &lt;strong&gt;expected&lt;/strong&gt; release dates, I’m totally satisfied.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; relieved 😮‍💨&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html#runtimes-future"&gt;docs&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Lambda: advanced logging controls&lt;/h3&gt;

&lt;p&gt;You can now select the &lt;strong&gt;JSON log format&lt;/strong&gt; for a Lambda function and use language-default logging tools, like the &lt;a href="https://developer.mozilla.org/en-US/docs/Web/API/Console"&gt;Node.js console object&lt;/a&gt; and the &lt;a href="https://docs.python.org/3/library/logging.html"&gt;Python logging module&lt;/a&gt;, to log messages in a &lt;strong&gt;structured JSON format&lt;/strong&gt;. When using it, you set the log level in the Lambda settings, which is a nice improvement over a standard environment variable. Additionally, system logs like &lt;code&gt;START&lt;/code&gt;, &lt;code&gt;END&lt;/code&gt;, and &lt;code&gt;REPORT&lt;/code&gt; are also logged as JSON.&lt;/p&gt;

&lt;p&gt;Another feature of advanced logging controls is &lt;strong&gt;setting a custom log group instead of the auto-generated one&lt;/strong&gt;, which can be helpful if you want to aggregate logs from multiple functions in a single place.&lt;/p&gt;
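
&lt;p&gt;In CDK, both settings could look roughly like this (a sketch – these props landed in &lt;code&gt;aws-cdk-lib&lt;/code&gt; around this time, so the exact names may differ in your version):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as logs from 'aws-cdk-lib/aws-logs';
import { Construct } from 'constructs';

declare const scope: Construct;

// A custom log group shared by multiple functions
const sharedLogGroup = new logs.LogGroup(scope, 'SharedLogs');

new lambda.Function(scope, 'MyFunction', {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: 'index.handler',
  code: lambda.Code.fromAsset('dist'),
  // JSON log format with log levels set in the function config
  loggingFormat: lambda.LoggingFormat.JSON,
  applicationLogLevel: lambda.ApplicationLogLevel.INFO,
  systemLogLevel: lambda.SystemLogLevel.WARN,
  logGroup: sharedLogGroup,
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;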

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--NUHiFO2H--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/r1axbnxt0i81kex620b5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--NUHiFO2H--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/r1axbnxt0i81kex620b5.png" alt="Lambda advanced logging configuration" width="800" height="777"&gt;&lt;/a&gt;&lt;/p&gt;
Lambda advanced logging configuration



&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; Initially, &lt;strong&gt;I was ecstatic&lt;/strong&gt; – no need for added logger dependencies! &lt;strong&gt;Then I tested it.&lt;/strong&gt; On Node.js, &lt;a href="https://twitter.com/radzikowski_m/status/1725500755153687036"&gt;extra parameters added in the standard way are inlined as a string, not as JSON fields&lt;/a&gt;. On Python, &lt;a href="https://twitter.com/radzikowski_m/status/1725511993551904938"&gt;the DEBUG log level includes logs from boto3&lt;/a&gt;, and there are a lot of them. &lt;strong&gt;A big no-no for me.&lt;/strong&gt; I’m continuing to &lt;a href="https://betterdev.blog/aws-lambda-logging-best-practices/#log_as_json"&gt;log as JSON&lt;/a&gt; with logger libraries, which also provide features like log sampling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; big sad 😭&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/compute/introducing-advanced-logging-controls-for-aws-lambda-functions/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;CloudWatch Logs: cheaper log class&lt;/h3&gt;

&lt;p&gt;CloudWatch Logs, with $0.50 per GB of ingested data, can quickly become one of the top costs for serverless applications. That’s why it’s essential to follow some &lt;a href="https://betterdev.blog/aws-lambda-logging-best-practices/"&gt;best practices&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Now&lt;/strong&gt;, however, you can choose the &lt;strong&gt;Infrequent Access&lt;/strong&gt; log class for your Log Group and &lt;strong&gt;pay only half the price for the ingest&lt;/strong&gt; – $0.25 per GB. There are, obviously, tradeoffs – not all features are available with Infrequent Access. You can’t create subscription filters, export to S3, or use Live Tail, Lambda Insights, or the new anomaly detection, and you can’t create metrics from logs with metric filters or the embedded metric format. Additionally, you can view the logs only through Logs Insights, not the regular log stream view, so &lt;strong&gt;reading logs costs $0.005 per GB of data scanned&lt;/strong&gt;.&lt;/p&gt;
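
&lt;p&gt;Until IaC support arrives (more on that below), a sketch of creating such a log group with the AWS SDK for JavaScript v3 – note that the log class can only be chosen at creation time:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { CloudWatchLogsClient, CreateLogGroupCommand } from '@aws-sdk/client-cloudwatch-logs';

const client = new CloudWatchLogsClient({});

// Create a log group in the cheaper Infrequent Access class
await client.send(new CreateLogGroupCommand({
  logGroupName: '/myapp/background-jobs',
  logGroupClass: 'INFREQUENT_ACCESS',
}));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;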

&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The lack of CloudFormation support&lt;/strong&gt; makes it, for now, only a theoretical feature for any IaC (unless you want it badly enough to create the Log Group with a custom resource on your own). The CDK may be the first to support it, since it creates Lambda Log Groups programmatically anyway.&lt;/p&gt;

&lt;p&gt;Apart from that, if you don’t need any of the unsupported capabilities, it’s definitely worth considering, especially for production environments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; mixed feelings 🥲&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/new-amazon-cloudwatch-log-class-for-infrequent-access-logs-at-a-reduced-price/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;CloudWatch Logs: pattern grouping&lt;/h3&gt;

&lt;p&gt;In CloudWatch Logs Insights, all logs are now additionally &lt;strong&gt;grouped into automatically recognized patterns&lt;/strong&gt;. "Patterns" are similar logs that differ only in some values, like dumped variables. That makes total sense, because there is usually only a limited set of log types your application writes.&lt;/p&gt;

&lt;p&gt;With patterns, it’s much easier to find unusual or alarming logs – instead of traversing hundreds of log messages, you only look at a few aggregated types of log messages. You can click on any of them to see all matching logs. Additionally, you can &lt;strong&gt;compare patterns between different periods&lt;/strong&gt; to see whether the number and ratio of log types have changed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--8X_rMg6j--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/p19a1dmz3rskmhofpjgi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--8X_rMg6j--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/p19a1dmz3rskmhofpjgi.png" alt="CloudWatch Logs Insights pattern detection" width="800" height="348"&gt;&lt;/a&gt;&lt;/p&gt;
Example patterns detected on relatively simple Lambda logs



&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; It’s one of those features that make you wonder how you could have lived without them for so long. And you get it for free (no additional cost apart from the regular price of CloudWatch Logs Insights), which is a cherry on top.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; simply amazed 🤩&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/amazon-cloudwatch-logs-now-offers-automated-pattern-analytics-and-anomaly-detection/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;CloudWatch Logs: anomaly detection&lt;/h3&gt;

&lt;p&gt;You can now enable anomaly detection on a CloudWatch Log Group. After a short learning time, it will automatically, well, detect anomalies in your logs. &lt;strong&gt;Anomalies are based on automatically recognized patterns&lt;/strong&gt; (similar to those described above) &lt;strong&gt;and changes in their frequency&lt;/strong&gt;. When enabled, CloudWatch will detect changes like a decreased number of success logs, an increased number of warning or error logs, or even a complete lack of logs, meaning your Lambda stopped being invoked (yes, I had to create such alarms in the past).&lt;/p&gt;

&lt;p&gt;You can temporarily or permanently suppress types of findings, which is very useful.&lt;/p&gt;

&lt;p&gt;To get notified of new anomalies, you need to create a CloudWatch Alarm. Apart from that, &lt;strong&gt;anomaly detection is free of charge&lt;/strong&gt; (or rather – included in the data ingestion price).&lt;/p&gt;
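
&lt;p&gt;If you prefer the API over the console, enabling a detector looks roughly like this (a sketch based on the API reference – verify the parameter names before use):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { CloudWatchLogsClient, CreateLogAnomalyDetectorCommand } from '@aws-sdk/client-cloudwatch-logs';

const client = new CloudWatchLogsClient({});

// Enable anomaly detection on a single log group (placeholder ARN)
await client.send(new CreateLogAnomalyDetectorCommand({
  logGroupArnList: ['arn:aws:logs:us-east-1:123456789012:log-group:/myapp/api'],
  detectorName: 'myapp-api-anomalies',
  evaluationFrequency: 'FIFTEEN_MIN',
}));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;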

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--zlFIHNfV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/koj5mndvoec13nboi2dm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--zlFIHNfV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/koj5mndvoec13nboi2dm.png" alt="CloudWatch Logs anomaly detection" width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;
Example detected anomalies in CloudWatch Logs (&lt;a href="https://aws.amazon.com/blogs/aws/amazon-cloudwatch-logs-now-offers-automated-pattern-analytics-and-anomaly-detection/"&gt;source&lt;/a&gt;)



&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AWS announced pattern grouping and anomaly detection in a single blog post, but I think those two are big and important enough to be separate topics. I’ll definitely use both of those features on real production apps sooner rather than later.&lt;/p&gt;

&lt;p&gt;With anomaly detection, &lt;strong&gt;the only problem may be the initial number of false positives&lt;/strong&gt;, especially if you link it to a CloudWatch Alarm. And the fact that we get this at no additional charge surprises me – in a good way, of course.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; simply amazed 🤩&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/amazon-cloudwatch-logs-now-offers-automated-pattern-analytics-and-anomaly-detection/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;CloudWatch Logs: query generator&lt;/h3&gt;

&lt;p&gt;While the CloudWatch Logs Insights &lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax.html"&gt;query syntax&lt;/a&gt; is not overly complicated, I don’t use it frequently enough to learn it by heart like SQL. Therefore, the new query generator that converts natural language expressions like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;find 10 longest Lambda invocation times
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;into:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;fields @timestamp, @message 
| filter @type = "REPORT" 
| stats max(@duration) as maxDuration by @logStream 
| sort maxDuration desc 
| limit 10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;is really something.&lt;/p&gt;

&lt;p&gt;Currently, the capability is in preview and available only in the &lt;code&gt;us-east-1&lt;/code&gt; and &lt;code&gt;us-west-2&lt;/code&gt; regions, free of charge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; I’m not the biggest fan of Generative AI, simply because everyone is adding it everywhere even if it does not make sense, sometimes worsening instead of improving services. That being said, the query generator for Logs Insights is &lt;strong&gt;an example of AI applied correctly&lt;/strong&gt;. I’m eager to put it to the test in real cases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; interested 🧐&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/use-natural-language-to-query-amazon-cloudwatch-logs-and-metrics-preview/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;DynamoDB: zero-ETL OpenSearch integration&lt;/h3&gt;

&lt;p&gt;DynamoDB can now &lt;strong&gt;automatically ingest items to OpenSearch&lt;/strong&gt;. You need to enable DynamoDB Streams and point-in-time recovery on the table, and the rest – both the initial data load and keeping it in sync – is handled for you. There is just one catch – it requires an &lt;strong&gt;OpenSearch Ingestion pipeline&lt;/strong&gt;.&lt;/p&gt;
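
&lt;p&gt;The table-side prerequisites are plain CDK; a sketch (the pipeline itself is configured on the OpenSearch Ingestion side):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import { Construct } from 'constructs';

declare const scope: Construct;

// The integration requires a stream and point-in-time recovery
// on the source table.
new dynamodb.Table(scope, 'Orders', {
  partitionKey: { name: 'pk', type: dynamodb.AttributeType.STRING },
  stream: dynamodb.StreamViewType.NEW_AND_OLD_IMAGES,
  pointInTimeRecovery: true,
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;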

&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; Ingesting from DynamoDB to OpenSearch is &lt;strong&gt;a common pattern&lt;/strong&gt; to get a fast and scalable database with added search capabilities. The built-in integration that does not require a Lambda function in the middle is totally spot on. &lt;strong&gt;The only problem lies in the OpenSearch Ingestion pipeline pricing&lt;/strong&gt;, starting at a not-so-small $170/month. While I see it as worthwhile for large applications, it’s unacceptable for serverless development and smaller services. Thus, I’m good with my Lambda functions for now…&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; not even sad, just disappointed 😑&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/amazon-dynamodb-zero-etl-integration-with-amazon-opensearch-service-is-now-generally-available/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;DynamoDB: zero-ETL Redshift integration&lt;/h3&gt;

&lt;p&gt;Similarly to the OpenSearch integration, DynamoDB will be able to keep data in sync with Redshift.&lt;/p&gt;

&lt;p&gt;The capability is now in limited preview, so it’s hard to say more about it. You can &lt;a href="https://pages.awscloud.com/Aurora-Limitless-Database-Preview.html"&gt;sign up for access&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; In my experience, it’s a less common pattern than ingest to OpenSearch, but a direct integration is still warmly welcomed. I just hope that, unlike the OpenSearch ingestion, it won’t introduce extra costs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; cautiously interested 🧐&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-dynamodb-zero-etl-integration-redshift/"&gt;announcement&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;AppSync: easier Aurora Data API integration&lt;/h3&gt;

&lt;p&gt;The AppSync JavaScript resolver utils now support making Aurora Data API requests with the &lt;code&gt;createMySQLStatement()&lt;/code&gt; and &lt;code&gt;createPgStatement()&lt;/code&gt; helper functions. A query can be provided as raw SQL or created with a builder. Additionally, in AppSync, you can now generate a whole API from an Aurora cluster with a few clicks.&lt;/p&gt;
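
&lt;p&gt;A resolver using the new helpers could look more or less like this (a sketch based on the docs – the table and arguments are made up):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { createPgStatement, sql, toJsonObject } from '@aws-appsync/utils/rds';

// Request: build a parameterized Postgres statement from the arguments
export function request(ctx) {
  return createPgStatement(
    sql`SELECT id, title FROM todos WHERE id = ${ctx.args.id}`
  );
}

// Response: unwrap the Data API result; toJsonObject returns
// one list of rows per executed statement.
export function response(ctx) {
  return toJsonObject(ctx.result)[0];
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;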

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--09Z7EzvX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/sfr43973y32a1c4gvwz1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--09Z7EzvX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/sfr43973y32a1c4gvwz1.png" alt="AppSync Data API integration with Amazon Aurora" width="800" height="376"&gt;&lt;/a&gt;&lt;/p&gt;
AppSync can make direct SQL queries… but only to Aurora Serverless v1 (&lt;a href="https://aws.amazon.com/blogs/mobile/build-a-graphql-api-for-your-amazon-aurora-mysql-database-using-aws-appsync-and-the-rds-data-api/"&gt;source&lt;/a&gt;)



&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; This sounds good until you realize that &lt;strong&gt;the Data API is still supported only by Aurora Serverless v1, not v2&lt;/strong&gt;. And that’s a &lt;strong&gt;big limitation&lt;/strong&gt; because v1 does not scale well and is available only with a few not-so-recent Aurora versions. And since you could already make Data API calls from AppSync, this does not introduce any truly new capabilities. Don’t get me wrong – the helper functions are great, but with no Data API support in Aurora Serverless v2, they are not worth much…&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; meh 😒&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/mobile/build-a-graphql-api-for-your-amazon-aurora-mysql-database-using-aws-appsync-and-the-rds-data-api/"&gt;announcement post&lt;/a&gt;, &lt;a href="https://docs.aws.amazon.com/appsync/latest/devguide/resolver-reference-rds-js.html"&gt;docs&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;ElastiCache Serverless&lt;/h3&gt;

&lt;p&gt;The new serverless version of ElastiCache is more expensive than the eight cheapest on-demand instance variants. But the pricing model is interesting because, unlike other expensive serverless services, you don’t pay for always-on idle processing units here, but only for actual per-request usage. What always generates costs is the data storage, with a &lt;strong&gt;minimum billable value of 1 GB&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; While better than for OpenSearch Serverless, &lt;strong&gt;the pricing model, with its roughly $90/month minimum charge, is still unsuitable for serverless workloads&lt;/strong&gt;. DynamoDB is a better key-value store for most new applications, so unless you are migrating an existing system, have particular needs, or have calculated significant cost savings, you will be better off with DynamoDB by default.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; could be worse 🙄&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/amazon-elasticache-serverless-for-redis-and-memcached-now-generally-available/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;SQS: FIFO throughput increase and DLQ redrive&lt;/h3&gt;

&lt;p&gt;SQS FIFO queues offer exactly-once processing and strict ordering. Of course, this comes at a price, and that price is limited throughput. While standard queues offer a "nearly unlimited number of transactions per second", FIFO queues have always had limits. But now, with &lt;strong&gt;support for 70,000 transactions per second&lt;/strong&gt; in high throughput mode, you have to try really hard to reach them.&lt;/p&gt;

&lt;p&gt;Also, FIFO queues now support the &lt;strong&gt;Dead Letter Queue redrive option&lt;/strong&gt;, allowing the re-delivery of messages that have not been processed.&lt;/p&gt;
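
&lt;p&gt;As for the high throughput mode, it’s an opt-in queue configuration; in CDK it comes down to two properties (a minimal sketch):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import * as sqs from 'aws-cdk-lib/aws-sqs';
import { Construct } from 'constructs';

declare const scope: Construct;

// High throughput FIFO: deduplication and throughput are scoped
// to the message group instead of the whole queue.
new sqs.Queue(scope, 'OrdersQueue', {
  fifo: true,
  deduplicationScope: sqs.DeduplicationScope.MESSAGE_GROUP,
  fifoThroughputLimit: sqs.FifoThroughputLimit.PER_MESSAGE_GROUP_ID,
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;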

&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; Sadly, I don’t work on any service requiring such a large throughput. However, it’s worth remembering that this is possible &lt;strong&gt;only in the high throughput mode&lt;/strong&gt;, where messages should belong to a uniform distribution of message groups, and FIFO order is guaranteed only within a single group.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; nice 🙂&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/announcing-throughput-increase-and-dead-letter-queue-redrive-support-for-amazon-sqs-fifo-queues/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;S3: Express One Zone storage class&lt;/h3&gt;

&lt;p&gt;The new S3 storage class, named &lt;strong&gt;Express One Zone&lt;/strong&gt;, is created to handle "&lt;strong&gt;hundreds of thousands of requests per second with consistent single-digit millisecond latency&lt;/strong&gt;". What’s important is that this is a purpose-built storage that works best for small files with a short life span.&lt;/p&gt;

&lt;p&gt;This storage is different enough from other S3 storage classes that it has a separate tab on the S3 page in the AWS Console. Next to the "general purpose buckets", those new Express One Zone buckets are under the "&lt;strong&gt;directory buckets&lt;/strong&gt;" tab. They have a special naming scheme and a new authentication method that grants access tokens valid for 5 minutes. Furthermore, when listing objects, prefixes must be full directory names ending with &lt;code&gt;/&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The main use case of Express One Zone buckets is to store intermediate files of data-intensive distributed computing, like AI/ML training. But I’m sure AWS customers will find many other applications.&lt;/p&gt;
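
&lt;p&gt;Working with a directory bucket looks mostly like regular S3; a sketch (the bucket name, with the Availability Zone ID and &lt;code&gt;--x-s3&lt;/code&gt; suffix, is a made-up example of the naming scheme):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { S3Client, PutObjectCommand, ListObjectsV2Command } from '@aws-sdk/client-s3';

const client = new S3Client({ region: 'us-east-1' });

// Directory bucket names embed the AZ ID and end with --x-s3
const bucket = 'my-temp-data--use1-az4--x-s3';

await client.send(new PutObjectCommand({
  Bucket: bucket,
  Key: 'shards/part-0001.bin',
  Body: 'intermediate data',
}));

// Prefixes must be whole directories ending with "/"
const listed = await client.send(new ListObjectsV2Command({
  Bucket: bucket,
  Prefix: 'shards/',
}));
console.log(listed.Contents?.map((object) =&gt; object.Key));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;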

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Mu6QGbzj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yey0sif9t1ojgf6qm12b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Mu6QGbzj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yey0sif9t1ojgf6qm12b.png" alt="S3 Directory buckets" width="800" height="396"&gt;&lt;/a&gt;&lt;/p&gt;
Express One Zone buckets are listed under new "Directory buckets" tab




&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; It’s a specific solution for a specific problem. Before you use it because "it’s faster", "it has directories", and "reads and writes cost half the Standard class price", please note &lt;strong&gt;it costs seven times more for storage&lt;/strong&gt; and &lt;strong&gt;does not offer the same durability and availability&lt;/strong&gt;, making it unsuitable for long-term or general-purpose storage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; nice 🙂&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/new-amazon-s3-express-one-zone-high-performance-storage-class/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;OpenSearch Serverless: vector engine&lt;/h3&gt;

&lt;p&gt;The vector engine for OpenSearch Serverless, meant as &lt;strong&gt;a database for ML/AI models&lt;/strong&gt;, is now generally available. Interestingly, OpenSearch and OpenSearch Serverless diverge more and more, with the Serverless version getting unique new features.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; I saw at least five new or adapted vector databases this year, so this one does not surprise me. And all would be good if not for the OpenSearch Serverless costs – &lt;strong&gt;starting at $700/month&lt;/strong&gt;. However, the announcement post mentions the possibility of running development workloads with no active replicas and 0.5 compute units, which would land at around $170/month – though I’m not entirely sure, since there is no mention of it on the pricing page.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; not my area, hard to say 😶&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/vector-engine-for-amazon-opensearch-serverless-is-now-generally-available/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Integrated Application Test Kit&lt;/h3&gt;

&lt;p&gt;The Integrated Application Test Kit, or IATK, is a new library that helps run tests for serverless applications in the cloud. It has a few useful features, like &lt;strong&gt;resolving resource physical IDs from the CloudFormation stack&lt;/strong&gt; or &lt;strong&gt;waiting for asynchronous EventBridge events&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The library is available in public preview and, for now, only in Python.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; I’m looking forward to the library’s development, both in terms of new features and supported platforms (Node.js in particular). The current capabilities are limited, and a lot will depend on the development team’s direction. But in a good scenario, it can become the base for running integration tests, replacing the helpers I currently write on my own.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; curious 🤔&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/compute/aws-integrated-application-test-kit/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Not Serverless (but still interesting)&lt;/h2&gt;

&lt;h3&gt;Aurora: Limitless preview&lt;/h3&gt;

&lt;p&gt;Scaling SQL databases is hard. Scaling SQL databases for writes is especially hard. And this is the problem the new Aurora Limitless Database tackles.&lt;/p&gt;

&lt;p&gt;The Limitless Database uses &lt;strong&gt;data sharding&lt;/strong&gt; to spread the load over multiple instances, with transaction routers managing writes and reads to and from the shards. Unlike read replicas, this horizontal scaling works for both read and write operations. And thanks to the routers, &lt;strong&gt;the sharding is transparent to the client&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Aurora Limitless is now in limited preview. You can &lt;a href="https://pages.awscloud.com/Aurora-Limitless-Database-Preview.html"&gt;sign up for access&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--_Be-gEjr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6rl4o7dt3v8qg7zs5ys3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_Be-gEjr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6rl4o7dt3v8qg7zs5ys3.png" alt="Aurora Limitless architecture" width="800" height="416"&gt;&lt;/a&gt;&lt;/p&gt;
Aurora Limitless architecture (&lt;a href="https://aws.amazon.com/blogs/aws/join-the-preview-amazon-aurora-limitless-database/"&gt;source&lt;/a&gt;)



&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; While data sharding is not new, having it as a managed AWS service removes a lot of complexity for extremely high-traffic applications requiring SQL.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; academically interested 🤓&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/join-the-preview-amazon-aurora-limitless-database/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;OpenSearch: direct S3 queries&lt;/h3&gt;

&lt;p&gt;You can now connect OpenSearch to S3 and &lt;strong&gt;make queries on the data in the buckets&lt;/strong&gt;. Similarly to Athena, it uses the Glue Data Catalog to represent your S3 data as tables. You can select one of &lt;strong&gt;three indexing strategies&lt;/strong&gt;, from ingesting only metadata to ingesting all the data from S3 into OpenSearch, which translates into different query performance.&lt;/p&gt;

&lt;p&gt;Indexing and making queries consume &lt;strong&gt;compute units&lt;/strong&gt;, which is an additional cost on top of your OpenSearch cluster. Although no compute units are consumed when "no queries or indexing activities are active", I’m not sure how that relates to keeping the S3 index up to date and whether that is a 24/7 activity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; Yes, it looks like Athena with extra steps. However, I see the benefit of making queries through a single engine in a uniform way. Moreover, indexing data in OpenSearch should significantly outperform Athena.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; interested 🧐&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/amazon-opensearch-service-zero-etl-integration-with-amazon-s3-preview/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;AWS SDK for Rust and Kotlin&lt;/h3&gt;

&lt;p&gt;The Rust and Kotlin AWS SDKs are now generally available.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; I’m not planning to use either of those two languages in the foreseeable future. But I know the community highly anticipated at least the Rust SDK, so I’m happy for the folks using it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; nice 🙂&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/developer/announcing-general-availability-of-the-aws-sdk-for-rust/"&gt;Rust AWS SDK announcement post&lt;/a&gt;, &lt;a href="https://aws.amazon.com/blogs/developer/aws-sdk-for-kotlin-ga/"&gt;Kotlin AWS SDK announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Cost Optimization Hub&lt;/h3&gt;

&lt;p&gt;The Billing and Cost Management pages are now merged into one. I never understood why they were separate and why I had to jump between them in the first place.&lt;/p&gt;

&lt;p&gt;In this new management page, there is now a Cost Optimization Hub. It &lt;strong&gt;aggregates optimization suggestions&lt;/strong&gt; from over ten different services.&lt;/p&gt;

&lt;p&gt;Using the Cost Optimization Hub is free, but you must &lt;strong&gt;opt in&lt;/strong&gt; to enable it. For best results, you must also opt in to Compute Optimizer.&lt;/p&gt;
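
&lt;p&gt;There is also an API for it; a sketch of opting in and pulling recommendations, assuming the &lt;code&gt;@aws-sdk/client-cost-optimization-hub&lt;/code&gt; client (I’m writing the response fields from memory, so verify them against the client’s types):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import {
  CostOptimizationHubClient,
  UpdateEnrollmentStatusCommand,
  ListRecommendationsCommand,
} from '@aws-sdk/client-cost-optimization-hub';

const client = new CostOptimizationHubClient({ region: 'us-east-1' });

// Opt the account in...
await client.send(new UpdateEnrollmentStatusCommand({ status: 'Active' }));

// ...and list the aggregated savings recommendations
const { items } = await client.send(new ListRecommendationsCommand({}));
console.log(items?.map((item) =&gt; [item.currentResourceType, item.estimatedMonthlySavings]));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;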

&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; While it’s an improvement, I don’t understand why we now have a separate &lt;strong&gt;Cost Optimization Hub&lt;/strong&gt; and &lt;strong&gt;Compute Optimizer&lt;/strong&gt;, especially since the former takes its data from the latter.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; confused 🤨&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/about-aws/whats-new/2023/11/cost-optimization-hub/"&gt;announcement&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;myApplications&lt;/h3&gt;

&lt;p&gt;The Console Home page now has a myApplications view, where you can create applications and get a dashboard with costs, alarms, security findings, and other widgets just for the selected resource groups. You add resources to an application by &lt;strong&gt;tagging them with a special &lt;code&gt;awsApplication&lt;/code&gt; tag&lt;/strong&gt;. You can choose resources manually or select a CloudFormation stack, and all its resources will be tagged for you, which is nice. But even better is to add the tag to all resources in your stack(s), for example, with the &lt;a href="https://docs.aws.amazon.com/cdk/v2/guide/tagging.html"&gt;CDK Tags aspect&lt;/a&gt;, as shown below.&lt;/p&gt;
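
&lt;p&gt;That approach is a one-liner; a sketch (the application ARN is a placeholder you copy from the myApplications page):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { App, Tags } from 'aws-cdk-lib';

const app = new App();

// Tag every resource in every stack of the app so that
// myApplications picks them up (placeholder ARN).
Tags.of(app).add(
  'awsApplication',
  'arn:aws:resource-groups:us-east-1:123456789012:group/MyApp/0123abcd',
);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;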

&lt;p&gt;There is also an &lt;code&gt;AWS::ServiceCatalogAppRegistry::ResourceAssociation&lt;/code&gt; CloudFormation resource suggested by the myApplications page that you can add to the stack. While I understand that it should associate your stack with the application, it does not work, and myApplications shows that you must add tags anyway.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rjGTsCt6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7fgg9z9bdkujg8gyee9w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rjGTsCt6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7fgg9z9bdkujg8gyee9w.png" alt="myApplications Compute widget" width="800" height="336"&gt;&lt;/a&gt;&lt;/p&gt;
myApplications Compute widget automatically selects and shows the most utilized resource statistics (&lt;a href="https://aws.amazon.com/blogs/aws/new-myapplications-in-the-aws-management-console-simplifies-managing-your-application-resources/"&gt;source&lt;/a&gt;)



&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; I know the lack of logical "here is your application and all its resources" groups in AWS is somewhat intimidating for beginners. But I doubt this is helpful for those new users – it took me about 15 minutes to fully understand how it works… Besides, if you follow basic AWS best practices, you deploy a single application per account, so it’s not needed. And I don’t understand why this is a thing separate from Resource Groups instead of an integral part of them…&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; meh 😒&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/new-myapplications-in-the-aws-management-console-simplifies-managing-your-application-resources/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;CloudWatch: Application Signals&lt;/h3&gt;

&lt;p&gt;In short, it’s &lt;strong&gt;automatic instrumentation and monitoring&lt;/strong&gt; of EKS Clusters. You can also use it for non-EKS applications by running the CloudWatch agent on your ECS Cluster or EC2 instance yourself. Notably, for now, the instrumentation works only for Java applications.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--_7GPVXnQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/sa5ijopm1dp334otyxou.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_7GPVXnQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/sa5ijopm1dp334otyxou.png" alt="Application Signals services view" width="800" height="385"&gt;&lt;/a&gt;&lt;/p&gt;
Application Signals services view (&lt;a href="https://aws.amazon.com/blogs/aws/amazon-cloudwatch-application-signals-for-automatic-instrumentation-of-your-applications-preview/"&gt;source&lt;/a&gt;)



&lt;p&gt;Another introduced capability of Application Signals is &lt;strong&gt;Service Level Objective (SLO) monitoring&lt;/strong&gt;. You can monitor one of the discovered services or a CloudWatch Metric and set the target objective. Then, you can tell your customers about the 99.9% uptime of your application based on the actual data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--MfWhp3Tg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gh7j54iwsz65o6425gea.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--MfWhp3Tg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gh7j54iwsz65o6425gea.png" alt="Applications Signals SLO monitoring" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;
 Applications Signals SLO monitoring (&lt;a href="https://aws.amazon.com/blogs/aws/amazon-cloudwatch-application-signals-for-automatic-instrumentation-of-your-applications-preview/"&gt;source&lt;/a&gt;)



&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; I’ve been lucky enough never to use Kubernetes, and I do not intend to. But I hope my less fortunate colleagues will find Application Signals useful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; nice 🙂&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/amazon-cloudwatch-application-signals-for-automatic-instrumentation-of-your-applications-preview/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Q: the AWS AI&lt;/h2&gt;

&lt;p&gt;I’ve left the "best" for the end. There are multiple levels to unpack here, so bear with me.&lt;/p&gt;

&lt;p&gt;Not surprisingly, "AI" continued to pop up everywhere during re:Invent, for better or worse. Undoubtedly, the biggest announcement in this area was Amazon Q, &lt;strong&gt;the AWS response to Large Language Models (LLMs)&lt;/strong&gt; taking over the world.&lt;/p&gt;

&lt;p&gt;But what is Q? Well, it’s one thing in four forms.&lt;/p&gt;

&lt;h3&gt;Form 1: generative AI service&lt;/h3&gt;

&lt;p&gt;The Amazon Q service lets you create &lt;strong&gt;a customized LLM working in a ChatGPT style&lt;/strong&gt;, trained on the materials you provide. Those may be, for example, your company knowledge base, enhancing Q with domain-specific facts. You can use a few configuration tweaks to tune and improve responses, like restricting irrelevant topics or defining the context for the answers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3ctVGi9L--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/r0gdg051udbeojw189jh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3ctVGi9L--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/r0gdg051udbeojw189jh.png" alt="Amazon Q data sources" width="800" height="1033"&gt;&lt;/a&gt;&lt;/p&gt;
There are plenty of data source connections to make your Q bot smarter (&lt;a href="https://aws.amazon.com/blogs/aws/introducing-amazon-q-a-new-generative-ai-powered-assistant-preview/"&gt;source&lt;/a&gt;)



&lt;p&gt;A side note: extra points for the short and simple service name. Totally unrelated: do you know &lt;strong&gt;how many letters you must type into the AWS Console search bar&lt;/strong&gt; before the service shows up in the results?&lt;/p&gt;

&lt;h3&gt;Form 2: AWS AI assistant&lt;/h3&gt;

&lt;p&gt;Without a doubt, you will notice the new popups in the AWS docs and the Console itself with an Amazon Q chat. You can ask it about AWS services and features and hope for the correct answer. It’s the Q service in practice, trained on AWS documentation, showing the service’s capabilities. It’s an excellent example of AWS &lt;a href="https://en.wikipedia.org/wiki/Eating_your_own_dog_food"&gt;dogfooding&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;However, it shows not only the good but also the bad sides of Amazon Q. Like every LLM, it’s prone to hallucinations, as shown on many screenshots circulating on Twitter. See this &lt;a href="https://twitter.com/QuinnyPig/status/1730405664042991928"&gt;thread from Corey Quinn&lt;/a&gt; as an example.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--RO1cbQsH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jlwl0k6i61odu6bjye8z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--RO1cbQsH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jlwl0k6i61odu6bjye8z.png" alt="Image description" width="730" height="1088"&gt;&lt;/a&gt;&lt;/p&gt;
You can learn a lot of new things from Amazon Q (&lt;a href="https://twitter.com/QuinnyPig/status/1730411650505986441"&gt;source&lt;/a&gt;)



&lt;p&gt;It also &lt;strong&gt;does not know about resources in your account&lt;/strong&gt; and does not make any operations. Thus, asking it about the total number of your Lambda functions will only give you a CLI command to check it yourself.&lt;/p&gt;

&lt;p&gt;It’s also integrated with CodeWhisperer, so you can chat with it without leaving your IDE.&lt;/p&gt;

&lt;h3&gt;Form 3: troubleshooting helper&lt;/h3&gt;

&lt;p&gt;There are new troubleshooting tools integrated into the AWS Console that use Q. Right now, it can help &lt;strong&gt;find the failed Lambda invocation error root cause&lt;/strong&gt; or debug &lt;strong&gt;VPC connectivity issues&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--KsEpwVto--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/z8gcw80dxb1xiyn7d21o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--KsEpwVto--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/z8gcw80dxb1xiyn7d21o.png" alt="Lambda troubleshooting with Amazon Q" width="800" height="678"&gt;&lt;/a&gt;&lt;/p&gt;
If identifying IAM permissions problems would be the only Amazon Q capability, it would be the most used AWS service anyway (&lt;a href="https://aws.amazon.com/blogs/aws/amazon-q-brings-generative-ai-powered-assistance-to-it-pros-and-developers-preview/"&gt;source&lt;/a&gt;)




&lt;h3&gt;Form 4: context-aware query generator&lt;/h3&gt;

&lt;p&gt;In Redshift, you can now &lt;strong&gt;write your question in a natural language and get SQL matching your tables&lt;/strong&gt; in a response. That’s actually pretty awesome. I hope similar functionalities will also land in other services.&lt;/p&gt;

&lt;h3&gt;It's still in preview&lt;/h3&gt;

&lt;p&gt;Everyone expects the best from AWS, but Amazon Q and all the capabilities it powers are still in preview, so some hiccups are understandable. I’m sure AWS will fine-tune it to reduce hallucinations and give better answers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My opinion:&lt;/strong&gt; Finding the appropriate documentation page in Google takes me less time than waiting for the chatbot’s answer. But I’ll give it a go. For now, I like that Amazon Q’s answers include the links to the documentation sources.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My reaction:&lt;/strong&gt; mixed feelings 😵‍💫&lt;/p&gt;

&lt;p&gt;See more: &lt;a href="https://aws.amazon.com/blogs/aws/introducing-amazon-q-a-new-generative-ai-powered-assistant-preview/"&gt;announcement post&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Wow, that turned out long.&lt;/p&gt;

&lt;p&gt;Recordings from re:Invent are available on YouTube: &lt;a href="https://www.youtube.com/playlist?list=PL2yQDdvlhXf_yTJdRlfK7K1ARdhYHhUvR"&gt;keynotes&lt;/a&gt;, &lt;a href="https://www.youtube.com/playlist?list=PL2yQDdvlhXf-5R7VtNr9P4nosA7DiDtM1"&gt;hundreds of presentations&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Please let me know in the comments if there is anything you liked the most from the announcements!&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Avoiding and solving CDK resource name conflicts</title>
      <dc:creator>Maciej Radzikowski</dc:creator>
      <pubDate>Mon, 30 Jan 2023 16:00:00 +0000</pubDate>
      <link>https://forem.com/aws-builders/avoiding-and-solving-cdk-resource-name-conflicts-3hen</link>
      <guid>https://forem.com/aws-builders/avoiding-and-solving-cdk-resource-name-conflicts-3hen</guid>
      <description>&lt;p&gt;CDK generates Logical IDs used by the CloudFormation to track and identify resources. In this post, I'll explain what Logical IDs are, how they're generated, and why they're important. Understanding this will help you avoid unexpected resource deletions and baffling "resource already exists" errors during deployment.&lt;/p&gt;

&lt;p&gt;CDK provides an abstraction layer over CloudFormation, which it uses under the hood. With CDK, Infrastructure as Code is easier and more secure. But to use CDK effectively, you still need to understand how CloudFormation works. Failing to do so can have dire consequences, like the accidental removal of all your production database data. And we don't want that.&lt;/p&gt;

&lt;h2&gt;Construct ID vs. Logical ID vs. Physical ID&lt;/h2&gt;

&lt;p&gt;Let's create a simple Stack with one &lt;strong&gt;Construct&lt;/strong&gt; - an SQS Queue. For the Construct ID, the second parameter in the constructor, we set &lt;code&gt;MyQueue&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Stack&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;StackProps&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aws-cdk-lib&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aws-cdk-lib/aws-sqs&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nx"&gt;MyStack&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nx"&gt;Stack&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Construct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="nx"&gt;StackProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;MyQueue&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After running &lt;code&gt;cdk deploy&lt;/code&gt; we get a CloudFormation stack with a single &lt;strong&gt;resource&lt;/strong&gt;. The generated CloudFormation template looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;MyQueueE6CA6235&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::SQS::Queue&lt;/span&gt;
    &lt;span class="na"&gt;UpdateReplacePolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Delete&lt;/span&gt;
    &lt;span class="na"&gt;DeletionPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Delete&lt;/span&gt;
    &lt;span class="na"&gt;Metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="s"&gt;aws:cdk:path: MyStack/MyQueue/Resource&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The template contains a resource &lt;code&gt;AWS::SQS::Queue&lt;/code&gt; with Logical ID &lt;code&gt;MyQueueE6CA6235&lt;/code&gt;. As you can see, the Logical ID is the Construct ID we provided, with an extra suffix added by the CDK.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CDK Constructs&lt;/strong&gt; relate to &lt;strong&gt;CloudFormation resources&lt;/strong&gt; in a one-to-many relationship. A single CDK Construct can create one or more CloudFormation resources. In this example, the Queue Construct creates a single &lt;code&gt;AWS::SQS::Queue&lt;/code&gt; resource.&lt;/p&gt;

&lt;p&gt;Yet another thing is the resource name, or the &lt;strong&gt;Physical ID&lt;/strong&gt;. If you go to the SQS page in the AWS Console, you will find a queue with a name like &lt;code&gt;MyStack-MyQueueE6CA6235-86lqOs0JG5ZC&lt;/code&gt;. It's the name auto-generated by CloudFormation, consisting of the stack name, the resource Logical ID, and a random suffix added by CloudFormation for uniqueness. This will become important further down the road, so read on.&lt;/p&gt;

&lt;p&gt;For now, we have three IDs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the Construct ID that we set in the CDK code (&lt;code&gt;MyQueue&lt;/code&gt;),&lt;/li&gt;
&lt;li&gt;the Logical ID generated by the CDK and put in the CloudFormation template (&lt;code&gt;MyQueueE6CA6235&lt;/code&gt;),&lt;/li&gt;
&lt;li&gt;the Physical ID (resource name) generated by CloudFormation (&lt;code&gt;MyStack-MyQueueE6CA6235-86lqOs0JG5ZC&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Additionally, the Physical ID is part of the ARN (Amazon Resource Name) used by clients to make API calls to the resource. The Logical ID matters to CloudFormation, but the Physical ID is what the resource's clients need.&lt;/p&gt;
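
&lt;p&gt;For illustration, this is roughly what our queue's ARN could look like - the region and account ID below are placeholders, but note the Physical ID embedded at the end:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;arn:aws:sqs:eu-west-1:123456789012:MyStack-MyQueueE6CA6235-86lqOs0JG5ZC
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;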

&lt;h2&gt;
  
  
  How CloudFormation tracks resources
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;CloudFormation identifies resources by their Logical IDs.&lt;/strong&gt; If we change the Logical ID in the CloudFormation template, CloudFormation sees it as two changes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;removal of the old resource,&lt;/li&gt;
&lt;li&gt;and creation of the new one.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In CloudFormation terms, this is called &lt;strong&gt;replacing&lt;/strong&gt; the resource.&lt;/p&gt;

&lt;p&gt;This behavior is described in the &lt;a href="https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-get-template.html"&gt;CloudFormation documentation&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;For most resources, changing the logical name of a resource is equivalent to deleting that resource and replacing it with a new one. Any other resources that depend on the renamed resource also need to be updated and might cause them to be replaced.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The simplest way to provoke it is to change the Construct ID:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Stack&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;StackProps&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aws-cdk-lib&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aws-cdk-lib/aws-sqs&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nx"&gt;MyStack&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nx"&gt;Stack&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Construct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="nx"&gt;StackProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;MyRenamedQueue&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When we run &lt;code&gt;cdk deploy&lt;/code&gt;, CloudFormation will first create a new SQS queue and only then remove the old one.&lt;/p&gt;

&lt;p&gt;The order of operations is essential here - &lt;strong&gt;CloudFormation will first create new resources, and only after that succeeds will it remove the old ones&lt;/strong&gt;. This minimizes downtime and prevents the removal of existing resources if something goes wrong during the update and it needs to be rolled back.&lt;/p&gt;

&lt;p&gt;Old resources are removed in the &lt;a href="https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-console-view-stack-data-resources.html"&gt;UPDATE_COMPLETE_CLEANUP_IN_PROGRESS phase&lt;/a&gt;, which is described as follows:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Ongoing removal of old resources for one or more stacks after a successful stack update. For stack updates that require resources to be replaced, CloudFormation creates the new resources first and then deletes the old resources to help reduce any interruptions with your stack. In this state, the stack has been updated and is usable, but CloudFormation is still deleting the old resources.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In the example above, we changed the Construct ID (and, therefore, the Logical ID), and the update went smoothly. But that's not always the case.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dangers of replacing CloudFormation resources
&lt;/h2&gt;

&lt;p&gt;By changing the CloudFormation resource Logical ID, we removed the existing SQS queue and created a new one. That's a dangerous thing to do in a production environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Losing production data by accident
&lt;/h3&gt;

&lt;p&gt;What if the queue had messages that were not yet processed? We would lose them.&lt;/p&gt;

&lt;p&gt;If, instead of an SQS queue, it were a DynamoDB table, an RDS instance, or any other database - we would replace it with a fresh, empty one.&lt;/p&gt;

&lt;p&gt;It's also not good when dealing with stateless resources like Lambda functions. By replacing one resource with another, we lose metrics continuity.&lt;/p&gt;

&lt;h3&gt;
  
  
  CloudFormation resource already exists error
&lt;/h3&gt;

&lt;p&gt;Losing data is not the only potential problem. Sometimes, CloudFormation may not perform the update at all, telling us that the resource we want to create already exists.&lt;/p&gt;

&lt;p&gt;Let's modify the first version of our Stack and add the &lt;code&gt;queueName&lt;/code&gt; property. This property corresponds to the queue's Physical ID. Previously, CloudFormation generated that name for us, keeping it unique by adding a random suffix. Now, we hardcode it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Stack&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;StackProps&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aws-cdk-lib&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aws-cdk-lib/aws-sqs&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Construct&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;constructs&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nx"&gt;MyStack&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nx"&gt;Stack&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Construct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="nx"&gt;StackProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;MyQueue&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;queueName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;my-queue&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we deploy the stack now and then do the same as before - change the Construct ID from &lt;code&gt;MyQueue&lt;/code&gt; to &lt;code&gt;MyRenamedQueue&lt;/code&gt;, leaving the &lt;code&gt;queueName&lt;/code&gt; as it is - updating the CloudFormation stack will fail:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE_FAILED | AWS::SQS::Queue | MyRenamedQueue
Resource handler returned message: "Resource of type 'AWS::SQS::Queue' with identifier 'my-queue' already exists." (RequestToken: 557cc5a2-5e53-feb7-1d7e-63d41aed398f, HandlerErrorCode: AlreadyExists)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why is that?&lt;/p&gt;

&lt;p&gt;The queue name must be unique within a given AWS account and region. The same goes for Lambda functions, DynamoDB tables, and, frankly, most other AWS resources.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;But wait!&lt;/em&gt;, you may say. We did not declare a second SQS queue with the same name. Our stack still contains a single queue.&lt;/p&gt;

&lt;p&gt;But let's look at the order of operations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;We create a CDK Construct with ID &lt;code&gt;MyQueue&lt;/code&gt; and name &lt;code&gt;my-queue&lt;/code&gt;

&lt;ol&gt;
&lt;li&gt;CloudFormation creates a queue with Logical ID &lt;code&gt;MyQueueE6CA6235&lt;/code&gt; (suffix added by the CDK) named &lt;code&gt;my-queue&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;
&lt;li&gt;We change the CDK Construct ID from &lt;code&gt;MyQueue&lt;/code&gt; to &lt;code&gt;MyRenamedQueue&lt;/code&gt;

&lt;ol&gt;
&lt;li&gt;CloudFormation sees it as the removal of &lt;code&gt;MyQueueE6CA6235&lt;/code&gt; and creation of &lt;code&gt;MyRenamedQueue5E166F18&lt;/code&gt; (suffix added by the CDK)&lt;/li&gt;
&lt;li&gt;Firstly, it tries to create the new queue &lt;code&gt;MyRenamedQueue5E166F18&lt;/code&gt; named &lt;code&gt;my-queue&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Creation fails - queue with name &lt;code&gt;my-queue&lt;/code&gt; already exists&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;How to fix it? There are two ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Restore the original Construct ID. However, as we will see in a moment, it may not always be possible if we refactor the code.&lt;/li&gt;
&lt;li&gt; Comment out the Construct, re-deploy the Stack (so the old resource is removed), uncomment the Construct, and re-deploy again to create the new resource.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Preventing CloudFormation resources replacement
&lt;/h2&gt;

&lt;p&gt;Okay, so to prevent all those problems, is it enough to not set the resource names by hand and not modify the Construct IDs? Well, unfortunately, it's not that simple.&lt;/p&gt;

&lt;h3&gt;
  
  
  Letting CloudFormation generate unique names
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The best practice is to let CloudFormation generate unique resource names instead of hardcoding them.&lt;/strong&gt; This has two benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;we prevent the errors like the one described above,&lt;/li&gt;
&lt;li&gt;we can deploy multiple instances of the same CloudFormation stack on the same account, for example, to create various environments of our service.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(The latter can also be achieved with hardcoded names by including the environment name in the resource name.)&lt;/p&gt;
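
&lt;p&gt;As a minimal sketch of that approach - assuming an &lt;code&gt;environment&lt;/code&gt; value that, in a real stack, would come from props or CDK context - the environment name can be baked into the hardcoded name:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Assumption: in a real stack, 'environment' would come from props or CDK context
const environment = 'dev';

new Queue(this, 'MyQueue', {
    queueName: `my-queue-${environment}`,
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;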

&lt;p&gt;But sometimes, auto-generated names are not suitable. In my experience, "hardcoded" names are better:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;for resources shared with other AWS accounts (for example, if a service in another AWS account pushes messages directly to our SQS queue) because if we remove and re-create the stack, the resource ARN will not change, and no update of external clients will be needed,&lt;/li&gt;
&lt;li&gt;for resources like Glue Tables, where a nice and short name is much better to use in Athena queries, and it needs to be unique only in the scope of the Glue Database.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Not changing CDK Construct IDs
&lt;/h3&gt;

&lt;p&gt;But as we discussed earlier, replacing resources is likely not the best thing to do in the first place. So to prevent it, we just don't modify the CDK Construct IDs. Simple enough, right?&lt;/p&gt;

&lt;p&gt;Well, you can guess - not really.&lt;/p&gt;

&lt;p&gt;Let's look again at our simple Stack with a Queue Construct:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Stack&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;StackProps&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aws-cdk-lib&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aws-cdk-lib/aws-sqs&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Construct&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;constructs&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nx"&gt;MyStack&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nx"&gt;Stack&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Construct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="nx"&gt;StackProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;MyQueue&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's say that as our service grows, we add more SQS queues and always need them to have dead-letter queues (DLQ) configured. So instead of repeating ourselves, we extract that setup into a separate Construct. Remember, &lt;strong&gt;Constructs are abstract CDK building blocks&lt;/strong&gt; you can nest, and each Construct may create one or more CloudFormation resources.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Stack&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;StackProps&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aws-cdk-lib&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aws-cdk-lib/aws-sqs&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;Construct&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;constructs&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nx"&gt;MyStack&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nx"&gt;Stack&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Construct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="nx"&gt;StackProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;MyQueueWithDLQ&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;MyQueueWithDLQ&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nx"&gt;MyQueueWithDLQ&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nx"&gt;Construct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Construct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dlq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DLQ&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;MyQueue&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;deadLetterQueue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="na"&gt;maxReceiveCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="na"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;dlq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We've moved the Queue from &lt;code&gt;MyStack&lt;/code&gt; into the &lt;code&gt;MyQueueWithDLQ&lt;/code&gt; Construct (instantiated with the ID &lt;code&gt;MyCustomQueue&lt;/code&gt;). But the Queue's own Construct ID stays the same - it's still &lt;code&gt;MyQueue&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If we re-deploy the stack now, we will see two new queues created:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;MyCustomQueueMyQueue20F468EB&lt;/code&gt;,&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;MyCustomQueueDLQE6D3019E&lt;/code&gt;,&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;and the existing one removed.&lt;/p&gt;

&lt;p&gt;Why is that?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CDK generates the Logical IDs based on the full Construct "path".&lt;/strong&gt; With nested Constructs, IDs of all "higher" Constructs are used to create the unique Logical ID. So when the path changed from &lt;code&gt;MyQueue&lt;/code&gt; to &lt;code&gt;MyCustomQueue/MyQueue&lt;/code&gt;, the generated Logical ID changed from &lt;code&gt;MyQueueE6CA6235&lt;/code&gt; to &lt;code&gt;MyCustomQueueMyQueue20F468EB&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;So even if we don't change the Construct IDs, moving Constructs into other Constructs changes the generated Logical IDs. This often happens during development or refactoring.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pinning Logical IDs during CDK refactoring
&lt;/h3&gt;

&lt;p&gt;Thankfully, we can still refactor our CDK code while preventing changes to resources' Logical IDs.&lt;/p&gt;

&lt;p&gt;To do so, we can override the Logical ID, setting it by hand instead of letting CDK generate it. Of course, doing this ahead of time is not recommended - it's meant for when we refactor the code and want to move an existing Construct. Then, we can check the current Logical ID and "pin" it so it won't change:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;CfnQueue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aws-cdk-lib/aws-sqs&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;MyRenamedQueue&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;defaultChild&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;CfnQueue&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;overrideLogicalId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;MyQueueE6CA6235&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
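
&lt;p&gt;A good sanity check for this technique is to run &lt;code&gt;cdk diff&lt;/code&gt; before deploying - if the refactoring with the pinned Logical ID is transparent to CloudFormation, the diff should show no resource replacements.&lt;/p&gt;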



&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;I hope this post clarifies how CDK and CloudFormation track resources and makes their behavior less confusing.&lt;/p&gt;

&lt;p&gt;What's important is that &lt;strong&gt;CloudFormation identifies resources by the Logical ID&lt;/strong&gt;, not the name or any other property. So if you change the Logical ID, a new resource is created, and then the old one is removed.&lt;/p&gt;

&lt;p&gt;Replacing resources with new ones is usually safe in development environments but dangerous in production, where it can cause us to lose data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CDK generates the Logical IDs from the Construct ID.&lt;/strong&gt; If you have nested Constructs, all higher Construct IDs are used to generate the Logical ID. Moving a Construct into another Construct changes its Logical ID.&lt;/p&gt;

&lt;p&gt;When we refactor the CDK code and want to move the Construct without causing the resource to be replaced, we can pin down the current Logical ID.&lt;/p&gt;

&lt;p&gt;A particularly nasty problem is changing the Logical ID of a resource with a hardcoded name. CloudFormation will first try to create the new resource and fail because the resource with the same name already exists. The solution is to either revert to the previous Logical ID or to temporarily remove the Construct from the Stack, re-deploy to remove the old resource, restore the Construct, and re-deploy again.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>cdk</category>
      <category>serverless</category>
      <category>cloudformation</category>
    </item>
    <item>
      <title>Top 12 Serverless Announcements from re:Invent 2022</title>
      <dc:creator>Maciej Radzikowski</dc:creator>
      <pubDate>Wed, 07 Dec 2022 09:07:00 +0000</pubDate>
      <link>https://forem.com/aws-builders/top-12-serverless-announcements-from-reinvent-2022-3mgh</link>
      <guid>https://forem.com/aws-builders/top-12-serverless-announcements-from-reinvent-2022-3mgh</guid>
      <description>&lt;p&gt;re:Invent 2022, the annual AWS conference in Las Vegas, is now behind us. I did not attend in person, but that gave me time to consolidate this list of top new serverless features while everyone else is sleeping off the intense 5-day conference. And I envy them just a little.&lt;/p&gt;

&lt;h2&gt;
  
  
  pre:Invent
&lt;/h2&gt;

&lt;p&gt;"pre:Invent" is a few weeks before the actual conference. You can always see an increased number of features and improvements releases in that period.&lt;/p&gt;

&lt;p&gt;Here are my favorite picks.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔑 Multiple MFA devices in IAM
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/security/you-can-now-assign-multiple-mfa-devices-in-iam/"&gt;(announcement post)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally.&lt;/p&gt;

&lt;p&gt;You could already set up &lt;strong&gt;Multi-Factor Authentication&lt;/strong&gt; for IAM users and the account root user. But until now, you were limited to a single MFA device. This was not ideal. If the device was lost or destroyed, you could get locked out of the account.&lt;/p&gt;

&lt;p&gt;But it's no issue anymore. Now you can assign up to 8 MFA devices, which can be a:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;virtual MFA device - like the &lt;a href="https://authy.com/features/setup/"&gt;Authy&lt;/a&gt; app&lt;/li&gt;
&lt;li&gt;FIDO security key - such as &lt;a href="https://www.yubico.com/products/"&gt;YubiKey&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;hardware TOTP token&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you don't have MFA enabled yet, especially for your AWS account root user - it's about time. Virtual MFA is easy and free to set up. On the other hand, FIDO is more secure, although it requires having a security key. Good news - you may be eligible for &lt;a href="https://aws.amazon.com/security/amazon-security-initiatives/free-mfa-security-key/"&gt;a free YubiKey from AWS&lt;/a&gt; if you are from the US.&lt;/p&gt;

&lt;p&gt;Yes, this is not serverless per se, but it's too important to omit.&lt;/p&gt;

&lt;h3&gt;
  
  
  🏃‍♂️ Lambda Node.js 18 runtime
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/compute/node-js-18-x-runtime-now-available-in-aws-lambda/"&gt;(announcement post)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;18.x is the currently active LTS version of Node.js. As with every version, it comes with various new features and improvements. One of the most significant is the &lt;strong&gt;Fetch API&lt;/strong&gt;, bringing the well-known &lt;code&gt;fetch()&lt;/code&gt; function from browsers to the backend, eliminating the need for third-party packages to make HTTP requests (or at least to make them easily). While still experimental, the Fetch API is available by default in Node 18.&lt;/p&gt;
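
&lt;p&gt;As a quick, hedged sketch of what this enables (the URL and response shape are placeholders), a Node.js 18 Lambda handler can call an external API without any HTTP client dependency:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;export async function handler() {
    // fetch() is available globally in the Node.js 18 runtime - no import or package needed
    const response = await fetch('https://api.example.com/data'); // placeholder URL
    const data = await response.json();

    return { statusCode: 200, body: JSON.stringify(data) };
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;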

&lt;p&gt;But, maybe even more importantly, &lt;strong&gt;Node.js 18.x Lambda runtime comes with AWS SDK v3 included&lt;/strong&gt;. That replaces AWS SDK v2, which was available in the previous runtime versions. Now, while using the new SDK v3, you can omit it from your code bundle to reduce its size since the SDK is already available in the runtime. That's not my favorite practice, but I know many folks are doing so.&lt;/p&gt;

&lt;p&gt;However, there are &lt;a href="https://github.com/aws/aws-lambda-base-images/issues/47#issuecomment-1327249479"&gt;reports of increased cold starts with Node 18 runtime&lt;/a&gt; versus Node 16. Hopefully, the Lambda team will improve this soon.&lt;/p&gt;

&lt;p&gt;If you are using the AWS JS SDK v3, the best way to mock it for unit tests is to use the &lt;a href="https://github.com/m-radzikowski/aws-sdk-client-mock"&gt;aws-sdk-client-mock&lt;/a&gt; library.&lt;/p&gt;
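
&lt;p&gt;As a minimal test sketch with that library - the queue URL is a placeholder, and a Jest-style test runner is assumed:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { mockClient } from 'aws-sdk-client-mock';
import { SQSClient, SendMessageCommand } from '@aws-sdk/client-sqs';

// Replace the real SQS client with a mock for the duration of the tests
const sqsMock = mockClient(SQSClient);
sqsMock.on(SendMessageCommand).resolves({ MessageId: '12345678-0000-0000-0000-000000000000' });

it('sends a message', async function () {
    const sqs = new SQSClient({});
    const result = await sqs.send(new SendMessageCommand({
        QueueUrl: 'https://sqs.us-east-1.amazonaws.com/123456789012/my-queue', // placeholder
        MessageBody: 'hello',
    }));
    expect(result.MessageId).toBeDefined();
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;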

&lt;h3&gt;
  
  
  ⏰ EventBridge Scheduler
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/compute/introducing-amazon-eventbridge-scheduler/"&gt;(announcement post)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The new EventBridge capability allows scheduling tasks for execution. But wait, we already had CloudWatch Events, later transformed into EventBridge scheduled rules. So what's new here, you may ask?&lt;/p&gt;

&lt;p&gt;Well, the new EventBridge Scheduler is much more powerful. For instance, &lt;strong&gt;it integrates with hundreds of AWS services&lt;/strong&gt;, allowing you to make thousands of API calls directly without a Lambda function.&lt;/p&gt;

&lt;p&gt;But the most distinct feature is &lt;strong&gt;one-time schedules&lt;/strong&gt;. Until now, setting up one-off actions to be executed in the future involved architecture patterns with DynamoDB TTL or periodic status checking. Now, you can offload this to EventBridge.&lt;/p&gt;

&lt;p&gt;The Scheduler comes with a soft limit of 1 million scheduled tasks, high throughput, and configurable time windows for distributing the load. The only drawback is that one-time tasks are not automatically deleted and count toward the scheduled tasks limit. However, the responsible team is &lt;a href="https://twitter.com/pinskinator/status/1594784355003830272"&gt;working on improving this soon&lt;/a&gt;.&lt;/p&gt;
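
&lt;p&gt;A rough sketch of creating a one-time schedule with the SDK - all names, ARNs, and the date below are placeholders:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { SchedulerClient, CreateScheduleCommand } from '@aws-sdk/client-scheduler';

export async function scheduleOneTimeTask() {
    const scheduler = new SchedulerClient({});

    // 'at(...)' creates a one-time schedule for the given UTC date and time
    await scheduler.send(new CreateScheduleCommand({
        Name: 'my-one-time-task', // placeholder
        ScheduleExpression: 'at(2023-01-01T12:00:00)',
        FlexibleTimeWindow: { Mode: 'OFF' },
        Target: {
            Arn: 'arn:aws:lambda:us-east-1:123456789012:function:my-function', // placeholder
            RoleArn: 'arn:aws:iam::123456789012:role/my-scheduler-role', // placeholder
        },
    }));
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;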

&lt;h3&gt;
  
  
  📨 EventBridge suffix, case-insensitive, and OR matching
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2022/11/amazon-eventbridge-enhanced-filtering-capabilities/"&gt;(announcement post)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Continuing with EventBridge, content-based event filtering has new capabilities. Now you can filter by a &lt;code&gt;suffix&lt;/code&gt; - this was a highly requested feature, with one use-case being &lt;strong&gt;filtering S3 object events by the file extension&lt;/strong&gt;. There is also a new &lt;code&gt;equals-ignore-case&lt;/code&gt; condition and an &lt;code&gt;$or&lt;/code&gt; directive to match if any of the provided conditions match.&lt;/p&gt;
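
&lt;p&gt;For example, a sketch of an event pattern using the new &lt;code&gt;suffix&lt;/code&gt; condition to match S3 "Object Created" events for CSV files - the bucket name is a placeholder:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// EventBridge event pattern, shown here as a TypeScript object
const pattern = {
    source: ['aws.s3'],
    'detail-type': ['Object Created'],
    detail: {
        bucket: { name: ['my-bucket'] }, // placeholder
        object: { key: [{ suffix: '.csv' }] },
    },
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;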

&lt;p&gt;See the &lt;a href="https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-event-patterns-content-based-filtering.html"&gt;documentation&lt;/a&gt; for the description of all filters.&lt;/p&gt;

&lt;h3&gt;
  
  
  🚀 AppSync JavaScript Resolvers
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/aws/aws-appsync-graphql-apis-supports-javascript-resolvers/"&gt;(announcement post)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This was &lt;a href="https://github.com/aws/aws-appsync-community/issues/147"&gt;the top-voted, long-awaited request for AppSync&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Resolvers are code snippets that integrate AppSync with other services. They are used to prepare the request and parse the response. Until now, you had to write them in VTL (Apache Velocity Template Language) - a format beloved by developers. If they didn't love it, why would they spend so much time writing VTLs, right?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;JavaScript Resolvers are the new default in AppSync.&lt;/strong&gt; However, they come with several limitations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/appsync/latest/devguide/core-features.html"&gt;not whole JavaScript syntax is supported&lt;/a&gt;,&lt;/li&gt;
&lt;li&gt;asynchronous operations are not supported,&lt;/li&gt;
&lt;li&gt;there is no external network connectivity,&lt;/li&gt;
&lt;li&gt;code must be a single file under 32 KB in size.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thus, even in JavaScript, they are still AppSync Resolvers. Their role is to prepare payloads that AppSync will pass on. They are not a replacement for Lambda functions for more complex operations.&lt;/p&gt;

&lt;p&gt;Still, this is a great improvement. With JavaScript, writing and testing Resolvers will be much easier. And, of course, you can use TypeScript and transpile it to JS!&lt;/p&gt;
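
&lt;p&gt;For a taste of the format, here is a hedged sketch of a resolver for a DynamoDB &lt;code&gt;GetItem&lt;/code&gt; data source, assuming a hypothetical &lt;code&gt;id&lt;/code&gt; argument:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { util } from '@aws-appsync/utils';

// Prepares the DynamoDB GetItem request from the GraphQL arguments
export function request(ctx) {
    return {
        operation: 'GetItem',
        key: util.dynamodb.toMapValues({ id: ctx.args.id }),
    };
}

// Passes the DynamoDB item through as the GraphQL response
export function response(ctx) {
    return ctx.result;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;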

&lt;h3&gt;
  
  
  🧩 Cross-account access in Step Functions
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/compute/introducing-cross-account-access-capabilities-for-aws-step-functions/"&gt;(announcement post)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Step Functions Task steps can now assume provided IAM roles and access resources on other AWS accounts directly.&lt;/p&gt;

&lt;p&gt;Until now, to access another account, you needed a Lambda function that would assume a cross-account role. &lt;strong&gt;Now you just provide the role ARN in the Task definition, and Step Function assumes it.&lt;/strong&gt; This way, you can make any API call to any service on a different account (with a role that gives you access to it, of course).&lt;/p&gt;
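
&lt;p&gt;In the state definition, this boils down to a &lt;code&gt;Credentials&lt;/code&gt; field. A sketch with placeholder ARNs, written here as a TypeScript object for readability:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Task state (Amazon States Language) assuming a cross-account role before the API call
const taskState = {
    Type: 'Task',
    Resource: 'arn:aws:states:::aws-sdk:sqs:sendMessage',
    Credentials: { RoleArn: 'arn:aws:iam::111122223333:role/cross-account-role' }, // placeholder
    Parameters: {
        QueueUrl: 'https://sqs.us-east-1.amazonaws.com/111122223333/my-queue', // placeholder
        MessageBody: 'hello from another account',
    },
    End: true,
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;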

&lt;h2&gt;
  
  
  re:Invent
&lt;/h2&gt;

&lt;p&gt;Of course, the biggest announcements were on the re:Invent itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  🚰 EventBridge Pipes
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/aws/new-create-point-to-point-integrations-between-event-producers-and-consumers-with-amazon-eventbridge-pipes/"&gt;(announcement post)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;EventBridge Pipes are here to make your Lambdas obsolete.&lt;/p&gt;

&lt;p&gt;Pipes are triggered by &lt;strong&gt;events from various sources&lt;/strong&gt;, just like Lambda functions. Then you can filter, &lt;strong&gt;enrich and transform&lt;/strong&gt; the incoming events. Finally, you &lt;strong&gt;send them to a target&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That flow describes a lot of Lambda functions I wrote. With Pipes, it's simple, low-code, reliable, and effective.&lt;/p&gt;

&lt;p&gt;At the moment of the initial release, Pipes support DynamoDB Streams, Kinesis Streams, SQS, MSK, and MQ as event sources. You can use Lambdas, Step Functions, or API calls for enriching events. Finally, Pipes can send events to 15 target destinations, including EventBridge buses, APIs, Kinesis Streams, Kinesis Firehose, SNS, SQS, Step Functions, Lambdas, and more.&lt;/p&gt;

&lt;p&gt;And all those features cost just $0.40 per million invocations (after filtering!). For comparison, it's the same price as for SQS requests. Furthermore, you can optimize it by batching the input events.&lt;/p&gt;

&lt;h3&gt;
  
  
  🪣 Step Functions Distributed Map
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/aws/step-functions-distributed-map-a-serverless-solution-for-large-scale-parallel-data-processing/"&gt;(announcement post)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Step Functions are great for data processing. But the payload you can pass between steps has a limited size, and the limited parallelism affects performance for larger jobs. This still makes file processing very dependent on Lambda functions.&lt;/p&gt;

&lt;p&gt;Well, no more.&lt;/p&gt;

&lt;p&gt;The new flavor of the Map state, the Distributed Map, is here to &lt;strong&gt;orchestrate large-scale processing jobs directly in the Step Functions&lt;/strong&gt;, focusing on S3 files. It can read a JSON or CSV file from S3 and iterate over individual records. Or, even better, it can list files from the S3 location on its own and iterate over them. Then, for processing the records or files, it starts separate child workflows with up to 10,000 parallel executions. And to optimize the work, it can process in batches (with a single child workflow getting multiple records/files as input).&lt;/p&gt;

&lt;h3&gt;
  
  
  🫰 Lambda SnapStart for Java
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/aws/new-accelerate-your-lambda-functions-with-lambda-snapstart/"&gt;(announcement post)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Java is known for long cold starts on Lambda. And even though I say cold starts are not a big problem in most cases, I mean cases where the initialization takes 0.5-1 second. With Java, it's often above 5 seconds, which is a whole different story.&lt;/p&gt;

&lt;p&gt;Probably that's why AWS decided to tackle the issue, starting with Java first. With the new SnapStart feature, &lt;strong&gt;the function initialization happens during the deployment&lt;/strong&gt;. Then the disk and memory state of the initialized environment are cached. So when you invoke the function, &lt;strong&gt;the environment is restored from the cache in under 200 ms&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I'm unlikely to write any Lambda function in Java. However, I'm hoping the SnapStart will also become available on other runtimes. If (or when) it comes to Python and Docker, it will be a game changer for &lt;a href="https://betterdev.blog/serverless-ml-on-aws-lambda/"&gt;serverless Machine Learning solutions&lt;/a&gt;, which also suffer from long cold starts.&lt;/p&gt;

&lt;h3&gt;
  
  
  🕵️‍♂️ Inspector support for Lambda
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/aws/amazon-inspector-now-scans-aws-lambda-functions-for-vulnerabilities/"&gt;(announcement post)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Amazon Inspector is a service that scans software libraries against known security vulnerabilities. It does not require installing any additional dependencies or agents. And after EC2 and ECR, it now supports Lambda functions.&lt;/p&gt;

&lt;p&gt;You just enable the Inspector in the AWS Console. &lt;strong&gt;Then it automatically and continuously scans all the Lambda functions on the account.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;How much does security cost? $0.30/Lambda/month.&lt;/p&gt;

&lt;p&gt;Should you enable it right away on the production account? Probably yes, unless you already have dependency vulnerabilities scanning in place (like GitHub Dependabot or Snyk).&lt;/p&gt;

&lt;h2&gt;
  
  
  no:Invent
&lt;/h2&gt;

&lt;p&gt;Unfortunately, there were some disappointments as well.&lt;/p&gt;

&lt;h3&gt;
  
  
  💸 OpenSearch "Serverless"
&lt;/h3&gt;

&lt;p&gt;One of the promises of serverless is no-use, no-pay pricing. AWS themselves said it &lt;a href="https://aws.amazon.com/blogs/industries/how-sgk-reduced-operating-costs-by-83-with-noops-serverless-microservices/"&gt;multiple&lt;/a&gt; &lt;a href="https://aws.amazon.com/blogs/publicsector/scaling-zero-serverless-way-future-university-of-york/"&gt;times&lt;/a&gt; in the past.&lt;/p&gt;

&lt;p&gt;But this year, AWS decided to break that promise. In my opinion - for marketing purposes, because "serverless" is trending now.&lt;/p&gt;

&lt;p&gt;So after MSK "Serverless", Aurora "Serverless" v2, and Neptune "Serverless", now we got OpenSearch "Serverless".&lt;/p&gt;

&lt;p&gt;The problem with all of them? They do not scale down to zero. Therefore, you will pay a minimum fee for created instances, even if not used at all.&lt;/p&gt;

&lt;p&gt;How much? &lt;strong&gt;Almost $700/month for the OpenSearch "Serverless".&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Why is that a problem? I'm glad you asked. &lt;a href="https://www.lastweekinaws.com/blog/no-aws-aurora-serverless-v2-is-not-serverless/"&gt;I wrote about this after the Aurora "Serverless" v2 release.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And don't get me wrong. The auto-scaling offer of all those services is a wonderful thing. I also understand it's not easy to make a database that will scale down to zero and then scale up to handle incoming requests with no additional latency. My only problem lies in the misleading naming.&lt;/p&gt;

&lt;h3&gt;
  
  
  🏅 No Serverless Specialty Certificate
&lt;/h3&gt;

&lt;p&gt;Despite all the marketing around serverless, there is still no Serverless Specialty &lt;a href="https://betterdev.blog/how-to-pass-aws-certification-exams/"&gt;AWS certificate&lt;/a&gt;. While serverless solutions are part of the Associate and Professional certificate exams, they make up only about 10% of the questions. A certificate that proves knowledge of modern, serverless architectures and solutions without EC2 machines and complex network routing is something the community eagerly awaits.&lt;/p&gt;

&lt;p&gt;But we got a consolation prize - &lt;a href="https://aws.amazon.com/blogs/compute/introducing-new-aws-serverless-digital-learning-badges/"&gt;a Serverless Learning Path in the AWS Skill Builder&lt;/a&gt;. It's a free, self-paced, online course where you can earn a badge on completion.&lt;/p&gt;

&lt;h2&gt;
  
  
  Notable mentions
&lt;/h2&gt;

&lt;p&gt;There were many, many more releases at re:Invent this year, serverless and otherwise.&lt;/p&gt;

&lt;p&gt;You can now &lt;a href="https://aws.amazon.com/about-aws/whats-new/2022/11/manage-resources-aws-organizations-cloudformation/"&gt;manage your AWS Organization through CloudFormation&lt;/a&gt;, including creating accounts, organizational units, and policies. It's one of those things you are surprised were not already possible. However, I will stick to the &lt;a href="https://github.com/org-formation/org-formation-cli"&gt;OrgFormation&lt;/a&gt; for my own accounts, as it offers additional features like deploying stacks and performing custom logic across the organization.&lt;/p&gt;

&lt;p&gt;AWS Glue, a service I'm not a big fan of personally, announced &lt;a href="https://aws.amazon.com/about-aws/whats-new/2022/11/introducing-aws-glue-4-0/"&gt;version 4.0&lt;/a&gt; and several new capabilities.&lt;/p&gt;

&lt;p&gt;SageMaker, &lt;a href="https://betterdev.blog/serverless-ml-on-aws-lambda/#why_not_sagemaker_serverless"&gt;already bloated with features&lt;/a&gt;, got at least a dozen more.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/blogs/compute/visualize-and-create-your-serverless-workloads-with-aws-application-composer/"&gt;Application Composer&lt;/a&gt; is a new visual tool for designing serverless applications. After the Step Functions Workflow Studio, it's another drag-and-drop solution suggesting that AWS wants to improve on the Developer Experience field. However, I doubt I will use it myself. I don't believe in a drag-and-drop application design. And it integrates with SAM for IaC while I'm on the team CDK.&lt;/p&gt;

&lt;p&gt;However, I'm looking forward to learning more about &lt;a href="https://aws.amazon.com/about-aws/whats-new/2022/11/amazon-verified-permissions-preview/"&gt;Amazon Verified Permissions&lt;/a&gt;, which is now in a closed preview. From my understanding, it will allow you to offload application permission management to AWS. I'll definitely give it a try.&lt;/p&gt;

&lt;h2&gt;
  
  
  Direction of serverless
&lt;/h2&gt;

&lt;p&gt;AWS Lambda is no longer a necessary element of serverless applications. More and more solutions can exclusively rely on low-code services like AppSync with its direct integrations (now connected with JavaScript), EventBridge (now with Pipes), or Step Functions (now with built-in file processing). If Lambda functions are used, their role is reduced.&lt;/p&gt;

&lt;p&gt;And that's a great thing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code is a liability.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;By moving the standard and repeatable tasks to the platform, we can innovate faster. There is less code to write, test, and maintain. Less code also means a lower risk of bugs.&lt;/p&gt;

&lt;p&gt;And that's the idea of serverless. Fewer things for us, developers, to manage. More focus on what matters for the business.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sessions recordings
&lt;/h2&gt;

&lt;p&gt;AWS re:Invent is not only about exciting new launches. It's also a lot of tech sessions.&lt;/p&gt;

&lt;p&gt;Sure, some talks are just brand marketing. But many technical presentations are given by the best people in the industry who built the solutions you are using. Those sessions come at all levels of advancement.&lt;/p&gt;

&lt;p&gt;Their recordings are available on this lengthy playlist (over 440 videos at this moment!): &lt;a href="https://www.youtube.com/playlist?list=PL2yQDdvlhXf_hIzmfHCdbcXj2hS52oP9r"&gt;AWS re:Invent 2022 sessions&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>conference</category>
    </item>
    <item>
      <title>How to pass AWS Certification exams</title>
      <dc:creator>Maciej Radzikowski</dc:creator>
      <pubDate>Wed, 23 Nov 2022 17:48:19 +0000</pubDate>
      <link>https://forem.com/aws-builders/how-to-pass-aws-certification-exams-9ch</link>
      <guid>https://forem.com/aws-builders/how-to-pass-aws-certification-exams-9ch</guid>
      <description>&lt;p&gt;I've never cared too much about certificates, apart from the SSL ones (haha). And yet I passed 7 AWS exams. Why? How to prepare? How to pass? How to pay only 50% for the exam? I answer all this and more in this post.&lt;/p&gt;

&lt;p&gt;After passing both Professional-level exams, the DevOps Engineer and the Solutions Architect, I shared my thoughts on them on Twitter. People were interested, so this post extends those tweets with content universal for all AWS certificates.&lt;/p&gt;


&lt;blockquote class="ltag__twitter-tweet"&gt;
      &lt;div class="ltag__twitter-tweet__media"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hcZpSvY1--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://pbs.twimg.com/media/FgkwntGVQAAjypD.png" alt="unknown tweet media content"&gt;
      &lt;/div&gt;

  &lt;div class="ltag__twitter-tweet__main"&gt;
    &lt;div class="ltag__twitter-tweet__header"&gt;
      &lt;img class="ltag__twitter-tweet__profile-image" src="https://res.cloudinary.com/practicaldev/image/fetch/s--3RiorXQ6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://pbs.twimg.com/profile_images/1315917249123942402/Oe639m6U_normal.jpg" alt="Maciej Radzikowski profile image"&gt;
      &lt;div class="ltag__twitter-tweet__full-name"&gt;
        Maciej Radzikowski
      &lt;/div&gt;
      &lt;div class="ltag__twitter-tweet__username"&gt;
        @radzikowski_m
      &lt;/div&gt;
      &lt;div class="ltag__twitter-tweet__twitter-logo"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ir1kO05j--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/twitter-f95605061196010f91e64806688390eb1a4dbc9e913682e043eb8b1e06ca484f.svg" alt="twitter logo"&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="ltag__twitter-tweet__body"&gt;
      This week I passed the AWS Architect Professional exam 🎉&lt;br&gt;&lt;br&gt;Here are my hints on preparing, exam scope, understanding questions, and some pro tips!&lt;br&gt;&lt;br&gt;A thread 🧵👇 
    &lt;/div&gt;
    &lt;div class="ltag__twitter-tweet__date"&gt;
      17:07 - 02 Nov 2022
    &lt;/div&gt;


    &lt;div class="ltag__twitter-tweet__actions"&gt;
      &lt;a href="https://twitter.com/intent/tweet?in_reply_to=1587853805961486336" class="ltag__twitter-tweet__actions__button"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fFnoeFxk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/twitter-reply-action-238fe0a37991706a6880ed13941c3efd6b371e4aefe288fe8e0db85250708bc4.svg" alt="Twitter reply action"&gt;
      &lt;/a&gt;
      &lt;a href="https://twitter.com/intent/retweet?tweet_id=1587853805961486336" class="ltag__twitter-tweet__actions__button"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--k6dcrOn8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/twitter-retweet-action-632c83532a4e7de573c5c08dbb090ee18b348b13e2793175fea914827bc42046.svg" alt="Twitter retweet action"&gt;
      &lt;/a&gt;
      &lt;a href="https://twitter.com/intent/like?tweet_id=1587853805961486336" class="ltag__twitter-tweet__actions__button"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SRQc9lOp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/twitter-like-action-1ea89f4b87c7d37465b0eb78d51fcb7fe6c03a089805d7ea014ba71365be5171.svg" alt="Twitter like action"&gt;
      &lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/blockquote&gt;



&lt;blockquote class="ltag__twitter-tweet"&gt;

  &lt;div class="ltag__twitter-tweet__main"&gt;
    &lt;div class="ltag__twitter-tweet__header"&gt;
      &lt;img class="ltag__twitter-tweet__profile-image" src="https://res.cloudinary.com/practicaldev/image/fetch/s--3RiorXQ6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://pbs.twimg.com/profile_images/1315917249123942402/Oe639m6U_normal.jpg" alt="Maciej Radzikowski profile image"&gt;
      &lt;div class="ltag__twitter-tweet__full-name"&gt;
        Maciej Radzikowski
      &lt;/div&gt;
      &lt;div class="ltag__twitter-tweet__username"&gt;
        @radzikowski_m
      &lt;/div&gt;
      &lt;div class="ltag__twitter-tweet__twitter-logo"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ir1kO05j--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/twitter-f95605061196010f91e64806688390eb1a4dbc9e913682e043eb8b1e06ca484f.svg" alt="twitter logo"&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="ltag__twitter-tweet__body"&gt;
      This week I passed the AWS DevOps Professional exam 🎉&lt;br&gt;&lt;br&gt;Here is a little guide - scope, what to pay attention to, and how I prepared for it in less than 2 weeks.&lt;br&gt;&lt;br&gt;A thread 🧵👇
    &lt;/div&gt;
    &lt;div class="ltag__twitter-tweet__date"&gt;
      17:00 - 13 Jan 2022
    &lt;/div&gt;


    &lt;div class="ltag__twitter-tweet__actions"&gt;
      &lt;a href="https://twitter.com/intent/tweet?in_reply_to=1481672401565876225" class="ltag__twitter-tweet__actions__button"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fFnoeFxk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/twitter-reply-action-238fe0a37991706a6880ed13941c3efd6b371e4aefe288fe8e0db85250708bc4.svg" alt="Twitter reply action"&gt;
      &lt;/a&gt;
      &lt;a href="https://twitter.com/intent/retweet?tweet_id=1481672401565876225" class="ltag__twitter-tweet__actions__button"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--k6dcrOn8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/twitter-retweet-action-632c83532a4e7de573c5c08dbb090ee18b348b13e2793175fea914827bc42046.svg" alt="Twitter retweet action"&gt;
      &lt;/a&gt;
      &lt;a href="https://twitter.com/intent/like?tweet_id=1481672401565876225" class="ltag__twitter-tweet__actions__button"&gt;
        &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SRQc9lOp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/twitter-like-action-1ea89f4b87c7d37465b0eb78d51fcb7fe6c03a089805d7ea014ba71365be5171.svg" alt="Twitter like action"&gt;
      &lt;/a&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/blockquote&gt;


&lt;h2&gt;
  
  
  What are AWS certificates?
&lt;/h2&gt;

&lt;p&gt;AWS offers 12 certificates. They come in four categories, covering different areas of AWS and varying in difficulty.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--lKr3Mr8X--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8etdrlk2u8ssfspr1rhw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--lKr3Mr8X--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8etdrlk2u8ssfspr1rhw.png" alt="All 12 AWS Certificates" width="531" height="746"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Foundational, Associate, and Professional-level certificates form learning paths for Architects and Engineers. They cover a broad spectrum of AWS services and solutions built on AWS. Specialty certificates focus on particular areas and go into much greater detail on the services in scope.&lt;/p&gt;

&lt;h3&gt;
  
  
  Which AWS certificate is for you?
&lt;/h3&gt;

&lt;p&gt;If you are a "tech" person - &lt;strong&gt;don't take the Cloud Practitioner - Foundational exam&lt;/strong&gt;. It is very abstract, requiring just a knowledge of what the cloud is, its advantages, concepts, and the purpose of core services. However, it may be a proper exam for Sales, Marketing, or Agile people from your organization, helping them understand the technology the Engineers are working with.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;There is no required order for taking the exams.&lt;/strong&gt; You don't need Associate certifications to pursue Professional ones.&lt;/p&gt;

&lt;p&gt;Nonetheless, &lt;strong&gt;I suggest starting with an Associate-level exam first&lt;/strong&gt;. Which one? The one that best matches your expertise and practical experience. The scope and expected knowledge for each exam are listed on &lt;a href="https://aws.amazon.com/certification/"&gt;the AWS Certification&lt;/a&gt; pages.&lt;/p&gt;

&lt;p&gt;Only after obtaining one or two Associate certificates would I recommend going for the Professional or Specialty ones, which are considerably more difficult.&lt;/p&gt;

&lt;h3&gt;
  
  
  How much do AWS exams cost?
&lt;/h3&gt;

&lt;p&gt;As you can see in the illustration above, the Foundational-level exam is the cheapest - it costs "only" 100 USD. The Associate-level exams cost 150 USD, and the Professional and Specialty - 300 USD.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;However, you can pay only half of that!&lt;/strong&gt; After you pass an exam, you get a voucher with a 50% discount for the next one. So, as long as you prepare well enough to pass on the first try, you pay half the price for every exam after the first.&lt;/p&gt;

&lt;h3&gt;
  
  
  Do AWS Certificates expire?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;All certificates are valid for three years.&lt;/strong&gt; Then, to keep them active, you must either re-take the exam or achieve a higher-level certification. The connection lines in the above illustration show which certificates prolong which. You can keep the Cloud Practitioner - Foundational certificate active by achieving any Associate-level certificate. The DevOps Engineer - Professional extends the validity of both the Developer and the SysOps Administrator from the Associate level. And the Solutions Architect - Associate can be prolonged by passing the Solutions Architect - Professional exam.&lt;/p&gt;

&lt;p&gt;Why do certificates expire? Besides the obvious monetary reasons, AWS constantly adds new features and services, and the exams are updated over time with refreshed questions to reflect this. After three years, there are always new and better ways to solve particular problems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why take AWS exams?
&lt;/h2&gt;

&lt;p&gt;Everyone can have different reasons for getting certified. I will list mine.&lt;/p&gt;

&lt;p&gt;Firstly, for me, &lt;strong&gt;getting certified is a great way to learn&lt;/strong&gt;. The exam scope pushes me to learn about services and features I may not have used and dive deeper into the ones I know. And at least part of this knowledge always comes in handy in my daily work.&lt;/p&gt;

&lt;p&gt;In my opinion, &lt;strong&gt;AWS exams are valuable for being close to real-life problems&lt;/strong&gt;. So after preparing for them, you are left with practical skills. And that's not something you can say about all the technical certificates out there...&lt;/p&gt;

&lt;p&gt;Secondly, &lt;strong&gt;getting certified is promoted by my employer,&lt;/strong&gt; &lt;a href="https://merapar.com/"&gt;&lt;strong&gt;Merapar&lt;/strong&gt;&lt;/a&gt;. The company is an &lt;a href="https://partners.amazonaws.com/partners/0010L00001mauzOQAQ/Merapar"&gt;Advanced Consulting Partner of AWS&lt;/a&gt;, and as such, it must maintain a certain number of active AWS certifications among its employees. Certifications achieved across the company are also proof of knowledge and expertise for our customers.&lt;/p&gt;

&lt;p&gt;Also, even though I discovered it only after the fact, being AWS certified gives you access to AWS Certification Lounges at events like AWS re:Invent or AWS Summit. I was at the Summit in London this year, and there were good snacks in the Lounge, so totally worth it!&lt;/p&gt;

&lt;p&gt;And finally, getting certified &lt;strong&gt;boosts your professional profile and CV&lt;/strong&gt;, opening doors to interviews and promotions. While it's not my most significant reason, it's certainly a valid one. And if you can get your current employer to pay for the learning and certification - something that can considerably help you if you look for another company within the next three years - that's a great deal.&lt;/p&gt;

&lt;h3&gt;
  
  
  Do they relate to real work?
&lt;/h3&gt;

&lt;p&gt;On each exam, you will encounter scenarios and services you don't deal with in your job. But that's because there are so many solutions you can deploy on AWS. To make the certification more tailored, it would need to be more granular, ending up with not 12 but 50 different certificates.&lt;/p&gt;

&lt;p&gt;There is probably no architect who, even across a few years, will work with all the scenarios you are tested against for the Solutions Architect - Professional certificate. Nor an engineer who will use all the kinds of databases and accompanying services you need to know for the Database - Specialty exam. But that doesn't invalidate the usefulness of the scope that overlaps with your day-to-day work.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to learn for an AWS Certificate?
&lt;/h2&gt;

&lt;p&gt;I always rely on three elements when learning for AWS Certificate exams:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;practical experience,&lt;/li&gt;
&lt;li&gt;certificate course,&lt;/li&gt;
&lt;li&gt;solving practice tests.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The first one, &lt;strong&gt;practical experience&lt;/strong&gt;, is not to be underestimated. The more hands-on experience you have with the services you are questioned about, the better. While you can learn everything in theory and pass the exam, I strongly advise against it. Getting at least some experience in core services for the given certificate will make learning and taking the exam much easier.&lt;/p&gt;

&lt;p&gt;Many times on the AWS exams, I got a question, wasn't sure about the answer right away, and figured it out based on similar work I did in the past. It's much easier to remember something you did hands-on than the information you only learned in the course.&lt;/p&gt;

&lt;p&gt;That leads us to the next part - courses. I recommend &lt;strong&gt;Udemy courses by&lt;/strong&gt; &lt;a href="https://www.udemy.com/user/stephane-maarek/"&gt;&lt;strong&gt;Stephane Maarek&lt;/strong&gt;&lt;/a&gt;. While I'm generally not a fan of video courses, they are the best, most comprehensive way I have found to go through the exam's scope and get condensed knowledge.&lt;/p&gt;

&lt;p&gt;But don't just watch. Active learning is much more effective!&lt;/p&gt;

&lt;p&gt;Make notes. Draw mind maps. Whatever suits you. And above all - &lt;strong&gt;go to the AWS Console and play with the services and concepts you learn&lt;/strong&gt;. If you set things up on your own once or twice, you will be able to recall them much better on the exam and will know which options are possible and which are not.&lt;/p&gt;

&lt;p&gt;And finally - &lt;strong&gt;solving practice tests&lt;/strong&gt;. It's a great learning technique - it forces your brain to actively work on problems and figure out the answer. It works even if you answer incorrectly, as long as you check the correct solution and its justification afterward. It will also prepare you for the type of questions on the exam.&lt;/p&gt;

&lt;p&gt;Where to get the practice questions from? For each certificate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;there are 10 sample questions in PDF linked on the AWS certificate page,&lt;/li&gt;
&lt;li&gt;20 more are in the "Official Practice Question Set" on the AWS Skill Builder, also linked on the certificate page in the resources section,&lt;/li&gt;
&lt;li&gt;few more are in the "Exam Readiness" training on the AWS Skill Builder,&lt;/li&gt;
&lt;li&gt;there are separate Udemy courses, again from Stephane Maarek, containing only practice tests with 100 to 400 sample questions,&lt;/li&gt;
&lt;li&gt;you can google for more sample questions for the individual exams.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It takes me 2-4 weeks to prepare for each exam. I schedule the exam shortly after I start learning for it - there is no better motivation than a deadline 🙃&lt;/p&gt;

&lt;h2&gt;
  
  
  How to solve AWS exam questions?
&lt;/h2&gt;

&lt;p&gt;Most AWS exam questions are scenario-based. Therefore, you need to know how to read and understand them to solve them.&lt;/p&gt;

&lt;p&gt;My process is as follows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Read the question thoroughly.&lt;/li&gt;
&lt;li&gt;Identify key phrases, services, and requirements.&lt;/li&gt;
&lt;li&gt;Identify the objective of the question.&lt;/li&gt;
&lt;li&gt;Scan through the answers and eliminate obviously incorrect ones.&lt;/li&gt;
&lt;li&gt;Reread the remaining answers and continue eliminating until only one is left, or choose the best of the rest.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The "objective" is often highlighted in the question, for example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Which solution meets the requirements in the &lt;strong&gt;MOST cost-effective&lt;/strong&gt; manner?"&lt;/li&gt;
&lt;li&gt;"Which combination of steps will meet these requirements with the &lt;strong&gt;LEAST change to the architecture&lt;/strong&gt;?"&lt;/li&gt;
&lt;li&gt;"Which solution meets the requirements with the &lt;strong&gt;LOWEST overall latency&lt;/strong&gt;?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Multiple answers may present a technically correct solution to a given scenario but bring different pros and cons. Thus, you need to weigh them against the identified objective.&lt;/p&gt;

&lt;p&gt;Always choose some answer. There are no negative points. You can flag a question to go back to it later, but it's better to select an answer right away in case you don't have spare time at the end of the exam.&lt;/p&gt;

&lt;h3&gt;
  
  
  Solving an example exam question
&lt;/h3&gt;

&lt;p&gt;Let's try it! From Database - Specialty sample questions:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A company’s ecommerce application stores order transactions in an Amazon RDS for MySQL database. The database has run out of available storage and the application is currently unable to take orders.&lt;/p&gt;

&lt;p&gt;Which action should a database specialist take to resolve the issue in the shortest amount of time?&lt;/p&gt;

&lt;p&gt;A) Add more storage space to the DB instance using the ModifyDBInstance action.&lt;br&gt;&lt;br&gt;
B) Create a new DB instance with more storage space from the latest backup.&lt;br&gt;&lt;br&gt;
C) Change the DB instance status from STORAGE_FULL to AVAILABLE.&lt;br&gt;&lt;br&gt;
D) Configure a read replica with more storage space.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Key phrases and services: &lt;strong&gt;Amazon RDS; storage is full, so writes fail&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
Objective: &lt;strong&gt;solve the issue with minimal downtime&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;First scan through answers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;C is incorrect. You can't just "tell the database it isn't full" and expect it to magically work without adding the storage space.&lt;/li&gt;
&lt;li&gt;D is incorrect. A read replica is for distributing reads from the database, while our problem is with writes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That leaves us with A and B. Both are theoretically possible. But the objective is to minimize the downtime, and creating a new DB instance from a backup could take hours, depending on the database size. That means B is incorrect too. So the answer is A - adding storage with the ModifyDBInstance action is the only option left, and it sounds reasonable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Solving multiple-response questions
&lt;/h3&gt;

&lt;p&gt;AWS exams also include multiple-response questions. The question always indicates how many answers you must choose.&lt;/p&gt;

&lt;p&gt;There are two types of multiple-response questions. You are asked to select either &lt;strong&gt;a combination of steps&lt;/strong&gt; to achieve the solution or &lt;strong&gt;multiple alternative solutions&lt;/strong&gt;. Check the wording of the question carefully.&lt;/p&gt;

&lt;p&gt;However, in 90% of cases, you are asked to choose a combination of steps. Those questions often contain pairs of answers. For example, if you must select 3 answers from 6, there are usually 3 aspects of the question scenario and 2 answers for each. So identify the pairs and choose the better answer from each of them.&lt;/p&gt;

&lt;p&gt;An example from the DevOps Engineer - Professional exam:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A devops engineer wants to implement a blue/green deployment process for an application on AWS and be able to gradually shift the traffic between the environments. The application runs on Amazon EC2 instances behind an Application Load Balancer. The instances run in an EC2 Auto Scaling group. Data is stored in an Amazon RDS Multi-AZ DB instance. External DNS is provided by Amazon Route 53.&lt;/p&gt;

&lt;p&gt;Which combination of steps will implement the blue/green process? (Select THREE.)&lt;/p&gt;

&lt;p&gt;A) Create a second Auto Scaling group behind the same Application Load Balancer.&lt;br&gt;&lt;br&gt;
B) Create a second Application Load Balancer and Auto Scaling group.&lt;br&gt;&lt;br&gt;
C) Create a second alias record in Route 53 pointing to the new environment and use a failover routing policy between the two records.&lt;br&gt;&lt;br&gt;
D) Create a second alias record in Route 53 pointing to the new environment and use a weighted routing policy between the two records.&lt;br&gt;&lt;br&gt;
E) Configure the new EC2 instances to use the same RDS database instance.&lt;br&gt;&lt;br&gt;
F) Configure the new EC2 instances to use the failover node of the RDS database instance.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Key phrases and services: &lt;strong&gt;EC2, Auto Scaling, Application Load Balancer, RDS, Route 53&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
Objective: &lt;strong&gt;implement blue/green deployment&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;We can see the pairs of answers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A and B refer to the architecture of Auto Scaling groups and Application Load Balancer,&lt;/li&gt;
&lt;li&gt;C and D are about setting up Route 53,&lt;/li&gt;
&lt;li&gt;E and F are about connecting EC2 to RDS.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now we choose one answer from each pair:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;B - we need a second Application Load Balancer to direct the traffic to it from Route 53.&lt;/li&gt;
&lt;li&gt;D - failover routing is for Disaster Recovery, while weighted routing allows us to shift the traffic gradually.&lt;/li&gt;
&lt;li&gt;E - both environments (blue and green) need to work simultaneously on the same RDS instance, and the failover node is, again, for Disaster Recovery.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How to take an AWS Certificate exam?
&lt;/h2&gt;

&lt;p&gt;You register and take the exam through one of the testing companies, Pearson VUE or PSI. From January 1, 2023 - only through Pearson VUE.&lt;/p&gt;

&lt;p&gt;You can take the exam at &lt;strong&gt;a local testing center&lt;/strong&gt; or &lt;strong&gt;online&lt;/strong&gt;. However, if you have a testing center nearby - go there. I took two exams online, and my experience was poor. First, you waste time installing testing software that monitors everything and often requires disabling any antivirus you have. Then you spend more time preparing your room and taking photos of it. And when you are finally ready to start, the software crashes. And then, for the next 30 minutes, you are trying to contact the support and make it work, stressing out about the technical issues on top of the exam itself. And it's not just my experience, but my colleagues' as well. So, if you can, &lt;strong&gt;go to the local testing center&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Sadly, you won't get the results right after you finish the exam. You must wait up to 24 hours. Usually, the first thing you get is a notification from &lt;a href="https://credly.com/"&gt;Credly&lt;/a&gt; about a new badge issued to you (if you passed) and, several hours later, an official email from AWS Training and Certification.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;AWS offers certifications that prove your knowledge on several levels and in distinct areas.&lt;/p&gt;

&lt;p&gt;Getting AWS Certified is a good way to deepen your AWS knowledge and prove that knowledge to your (current or future) employer and customers. But it does not replace the hands-on experience. Quite the contrary - it will be much easier to pass an exam having at least some practical experience in the area first.&lt;/p&gt;

&lt;p&gt;It's best to start with an Associate-level certificate. Then continue with Specialty or Professional, depending on your area of expertise.&lt;/p&gt;

&lt;p&gt;While those exams cost from $150 (Associate) to $300 (Professional and Specialty), after each exam you pass, you get a voucher with a 50% discount for the next one. The only trick to always paying half price (except for the first exam) is not to fail 😉&lt;/p&gt;

&lt;p&gt;AWS exams are not trivial, so you must prepare accordingly. I recommend three learning methods in conjunction: getting practical experience in the area, going through a certificate course, and solving practice tests.&lt;/p&gt;

&lt;p&gt;AWS exam questions are usually scenario-based, and you need to learn how to understand and solve them. This is one of the reasons why taking practice tests is so important.&lt;/p&gt;

&lt;p&gt;And finally, you can take the exam at a local test center or online. If available, I recommend the first option, as it's less stressful and, counterintuitively, often less time-consuming.&lt;/p&gt;

&lt;p&gt;Good luck!&lt;/p&gt;

&lt;p&gt;PS. Do you have any tips and tricks for the AWS exams yourself? Please share in the comments!&lt;/p&gt;

</description>
      <category>aws</category>
      <category>certification</category>
    </item>
    <item>
      <title>Running Serverless ML on AWS Lambda</title>
      <dc:creator>Maciej Radzikowski</dc:creator>
      <pubDate>Mon, 21 Nov 2022 16:18:00 +0000</pubDate>
      <link>https://forem.com/aws-builders/running-serverless-ml-on-aws-lambda-2pbg</link>
      <guid>https://forem.com/aws-builders/running-serverless-ml-on-aws-lambda-2pbg</guid>
      <description>&lt;p&gt;Yes, you can run Machine Learning models on serverless, directly with AWS Lambda. I know because I built and productionized such solutions. It's not complicated, but there are a few things to be aware of. I explain them in this in-depth tutorial, where we build a serverless ML pipeline.&lt;/p&gt;

&lt;p&gt;As always, the link to the complete project on GitHub is at the end of the post.&lt;/p&gt;

&lt;h2&gt;
  
  
  Productionizing ML solutions on AWS
&lt;/h2&gt;

&lt;p&gt;There is a wide variety of advanced ML models available on the internet. You can download and use them with just a few lines of code in a high-level ML library. But there is a gap between running amazing models locally and productionizing their usage.&lt;/p&gt;

&lt;p&gt;Here comes &lt;strong&gt;serverless&lt;/strong&gt;, allowing you to run your models in the cloud as simply as you do ad-hoc jobs locally and build event-driven ML pipelines without managing any infrastructure components. And, of course, all that while paying only for what you actually use, not for some virtual machines waiting idly for work.&lt;/p&gt;

&lt;p&gt;But there must be some troubles along the way. Otherwise, I would not have to write this post.&lt;/p&gt;

&lt;p&gt;The number one problem we face with running ML models is &lt;strong&gt;the size of the dependencies&lt;/strong&gt;. Both ML models and libraries are huge. The other thing to consider is &lt;strong&gt;latency&lt;/strong&gt; - loading the model into memory takes time. But we can tackle both those issues.&lt;/p&gt;

&lt;p&gt;So let's build a serverless Machine Learning pipeline. We just need a use case. How about &lt;strong&gt;automatically generating captions for uploaded images&lt;/strong&gt;?&lt;/p&gt;

&lt;h2&gt;
  
  
  Serverless ML pipeline architecture
&lt;/h2&gt;

&lt;p&gt;Our objective is simple: &lt;strong&gt;generate a caption for each uploaded image&lt;/strong&gt;. That's complex enough to make it a real-life example while keeping the tutorial concise.&lt;/p&gt;

&lt;p&gt;The starting point of our pipeline is an S3 bucket. When we upload images to it, the bucket will send notifications about new objects to the Lambda function. There we do the Machine Learning magic and save the generated caption in the DynamoDB table.&lt;/p&gt;

&lt;p&gt;We will define the infrastructure with AWS CDK, &lt;a href="https://betterdev.blog/aws-cdk-pros-and-cons/"&gt;my preferred Infrastructure as Code tool&lt;/a&gt;. Once we declare the infrastructure, the CDK will use CloudFormation to deploy the needed resources.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--R3Ys_MiP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qcfrzn4eg9r9jesedalm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--R3Ys_MiP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qcfrzn4eg9r9jesedalm.png" alt="Serverless Machine Learning pipeline for auto-generating image captions" width="581" height="361"&gt;&lt;/a&gt;&lt;/p&gt;
Serverless Machine Learning pipeline for auto-generating image captions



&lt;h3&gt;
  
  
  Python, the obvious choice
&lt;/h3&gt;

&lt;p&gt;For Machine Learning, Python is the default language of choice. So this is also our choice for the Lambda function.&lt;/p&gt;

&lt;p&gt;AWS CDK also supports Python, so it makes perfect sense to use it to define the infrastructure and keep the project uniform.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why not SageMaker Serverless?
&lt;/h3&gt;

&lt;p&gt;Amazon SageMaker is a Swiss Army knife for Machine Learning. But I don't mean the handy pocket version. Rather something like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--sDItapAQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jr7r570w57ymatwaycox.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--sDItapAQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jr7r570w57ymatwaycox.png" alt="Wenger 16999 Swiss Army Knife Giant, with 87 tools included, looks a lot like Amazon SageMaker" width="880" height="621"&gt;&lt;/a&gt;&lt;/p&gt;
Wenger 16999 Swiss Army Knife Giant, with 87 tools included, looks a lot like Amazon SageMaker



&lt;p&gt;With &lt;a href="https://docs.aws.amazon.com/sagemaker/latest/dg/serverless-endpoints.html"&gt;SageMaker Serverless Inference&lt;/a&gt;, you can deploy and use an ML model, paying only for the actual usage. However, it runs on AWS Lambda under the hood, bringing the same limitations - like the lack of the GPU.&lt;/p&gt;

&lt;p&gt;At the same time, introducing SageMaker adds complexity to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the architecture - adding another service to our pipeline and calling it from the Lambda function,&lt;/li&gt;
&lt;li&gt;the deployment process - preparing and deploying the model to SageMaker,&lt;/li&gt;
&lt;li&gt;and the code - using the SageMaker library.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While the advantages of using SageMaker libraries and tooling could be significant in some scenarios, we intend to use pre-trained models and high-level libraries that will be entirely sufficient on their own.&lt;/p&gt;

&lt;h2&gt;
  
  
  Generating image captions with Machine Learning
&lt;/h2&gt;

&lt;p&gt;Let's start with the Lambda function code that will be the heart of our pipeline.&lt;/p&gt;

&lt;h3&gt;
  
  
  Python libraries
&lt;/h3&gt;

&lt;p&gt;We will use an existing model from &lt;a href="https://huggingface.co/"&gt;🤗 Hugging Face&lt;/a&gt;. It's a platform containing pre-trained ML models for various use cases. It also provides &lt;a href="https://huggingface.co/docs/transformers/index"&gt;🤗 Transformers&lt;/a&gt; - a high-level ML library making using those models dead simple. So simple that even I can use it.&lt;/p&gt;

&lt;p&gt;For dependency management, we will use &lt;a href="https://python-poetry.org/"&gt;Poetry&lt;/a&gt;. Why not pip? Because &lt;a href="https://betterprogramming.pub/5-reasons-why-poetry-beats-pip-python-setup-6f6bd3488a04"&gt;Poetry is better in every aspect&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So we use it to install the libraries we need:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;poetry add boto3 transformers[torch] pillow
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://boto3.amazonaws.com/v1/documentation/api/latest/index.html"&gt;boto3&lt;/a&gt; is an AWS SDK for Python. We will need it to communicate with S3 and DynamoDB from the Lambda function.&lt;/p&gt;

&lt;p&gt;Then we add the 🤗 Hugging Face &lt;code&gt;transformers&lt;/code&gt; library mentioned above, specifying it should also install the &lt;a href="https://pytorch.org/"&gt;PyTorch&lt;/a&gt; ML framework it will use under the hood.&lt;/p&gt;

&lt;p&gt;And finally, we need the &lt;a href="https://pillow.readthedocs.io/en/stable/"&gt;Pillow&lt;/a&gt; library for image processing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pre-trained model
&lt;/h3&gt;

&lt;p&gt;As I mentioned, we will use an existing, pre-trained model that does exactly what we need: &lt;a href="https://huggingface.co/nlpconnect/vit-gpt2-image-captioning"&gt;nlpconnect/vit-gpt2-image-captioning&lt;/a&gt; from 🤗 Hugging Face. We just need to download it.&lt;/p&gt;

&lt;p&gt;Because the pre-trained model is large, around 1 GB, we need the &lt;a href="https://git-lfs.github.com/"&gt;Git LFS&lt;/a&gt; extension installed to download it. Then we run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git lfs &lt;span class="nb"&gt;install
&lt;/span&gt;git clone https://huggingface.co/nlpconnect/vit-gpt2-image-captioning
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
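
&lt;p&gt;Before wiring the model into Lambda, it's worth a quick local sanity check. Here is a minimal sketch of my own (not from the project repo), assuming the clone sits in the current directory and &lt;code&gt;test.jpg&lt;/code&gt; is any local image:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from PIL import Image
from transformers import VisionEncoderDecoderModel, ViTFeatureExtractor, AutoTokenizer

# Load the cloned model from disk - no network access needed
model = VisionEncoderDecoderModel.from_pretrained("./vit-gpt2-image-captioning")
feature_extractor = ViTFeatureExtractor.from_pretrained("./vit-gpt2-image-captioning")
tokenizer = AutoTokenizer.from_pretrained("./vit-gpt2-image-captioning")

# The model expects RGB input
image = Image.open("test.jpg").convert("RGB")

pixel_values = feature_extractor(images=[image], return_tensors="pt").pixel_values
output_ids = model.generate(pixel_values, max_length=16, num_beams=4)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0].strip())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;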



&lt;h3&gt;
  
  
  Lambda code
&lt;/h3&gt;

&lt;p&gt;The Lambda code is just 51 lines (and I put blank lines generously!).&lt;/p&gt;

&lt;p&gt;&lt;code&gt;captioning_lambda/main.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;io&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BytesIO&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;boto3&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;PIL&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;VisionEncoderDecoderModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ViTFeatureExtractor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;

&lt;span class="n"&gt;s3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"s3"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;dynamodb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"dynamodb"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;captions_table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dynamodb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"TABLE_NAME"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;VisionEncoderDecoderModel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"./vit-gpt2-image-captioning"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;feature_extractor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ViTFeatureExtractor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"./vit-gpt2-image-captioning"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"./vit-gpt2-image-captioning"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Records&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Records&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;

    &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;load_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;caption&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;generate_caption&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;persist_caption&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;caption&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;file_byte_string&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="s"&gt;"Body"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BytesIO&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_byte_string&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mode&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;"RGB"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;convert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"RGB"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_caption&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;pixel_values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;feature_extractor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;images&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;return_tensors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"pt"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;pixel_values&lt;/span&gt;

    &lt;span class="n"&gt;output_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pixel_values&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_beams&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;batch_decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_ids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skip_special_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;persist_caption&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;caption&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;captions_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;put_item&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s"&gt;"key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"caption"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;caption&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Firstly, in lines &lt;code&gt;8-11&lt;/code&gt;, we create &lt;code&gt;boto3&lt;/code&gt; clients to interact with S3 and DynamoDB. For DynamoDB, we need the table name, which we pass to the Lambda function as an environment variable.&lt;/p&gt;

&lt;p&gt;Then we load the previously downloaded ML models (lines &lt;code&gt;13-15&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;We do both those things &lt;strong&gt;outside the handler method&lt;/strong&gt;. Therefore, this code will be executed only once, &lt;a href="https://betterdev.blog/aws-lambda-performance-optimization/#one-time_initialization"&gt;when the Lambda environment is created&lt;/a&gt;, not on every Lambda execution. This is critical, as loading the models takes quite a long time. We will look at it in more detail a bit later.&lt;/p&gt;

&lt;p&gt;Next comes the handler method, called on every Lambda invocation. The event we receive contains the details of the newly created S3 object, and we extract the bucket name and the object key from it.&lt;/p&gt;
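
&lt;p&gt;For reference, here is roughly what the relevant part of the S3 notification event looks like, written as a Python dict you could use to invoke the handler locally (the bucket name and object key are made up):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Trimmed-down S3 "ObjectCreated" notification event - only the fields our handler reads
event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "my-images-bucket"},  # hypothetical bucket name
                "object": {"key": "uploads/cat.jpg"},  # hypothetical object key
            }
        }
    ]
}

# Note: this calls real S3 and DynamoDB, so AWS credentials and TABLE_NAME must be set
handler(event, None)  # the context argument is unused, so None works for a local test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;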

&lt;p&gt;Then, we do three simple steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Fetch the image from the S3 bucket.&lt;/li&gt;
&lt;li&gt;Use the previously loaded ML models to understand the image content and generate a caption.&lt;/li&gt;
&lt;li&gt;Save the caption in the DynamoDB table.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Overcoming Lambda size limitations
&lt;/h2&gt;

&lt;p&gt;If we package our code right now, with libraries and model, and upload it to Lambda with a Python environment, we will get an error. The package size limit is 250 MB. Our package is... around 3 GB.&lt;/p&gt;

&lt;p&gt;The 250 MB limit includes &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html"&gt;Lambda layers&lt;/a&gt;, so they are not a solution here.&lt;/p&gt;

&lt;p&gt;So what is the solution? &lt;strong&gt;Bundling it as a Docker image instead.&lt;/strong&gt; The Docker image size limit for Lambda is 10 GB.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lambda Docker image
&lt;/h3&gt;

&lt;p&gt;We will use a &lt;a href="https://docs.docker.com/build/building/multi-stage/"&gt;multi-stage build&lt;/a&gt; for the Docker image to omit build dependencies in our target image.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;captioning_lambda/Dockerfile&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;public.ecr.aws/docker/library/python:3.9.15-slim-bullseye&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;build&lt;/span&gt;

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /root&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    curl &lt;span class="se"&gt;\
&lt;/span&gt;    git &lt;span class="se"&gt;\
&lt;/span&gt;    git-lfs

&lt;span class="k"&gt;RUN &lt;/span&gt;curl &lt;span class="nt"&gt;-sSL&lt;/span&gt; https://install.python-poetry.org | python3 -
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; PATH="/root/.local/bin:$PATH"&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;git lfs &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;git clone &lt;span class="nt"&gt;--depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 https://huggingface.co/nlpconnect/vit-gpt2-image-captioning
&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; vit-gpt2-image-captioning/.git

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; pyproject.toml poetry.lock ./&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;poetry &lt;span class="nb"&gt;export&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; requirements.txt &lt;span class="nt"&gt;--output&lt;/span&gt; requirements.txt

&lt;span class="c"&gt;########################################&lt;/span&gt;

&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; public.ecr.aws/lambda/python:3.9.2022.10.26.12&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=build /root/vit-gpt2-image-captioning ./vit-gpt2-image-captioning&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=build /root/requirements.txt ./&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;pip3 &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt &lt;span class="nt"&gt;--target&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LAMBDA_RUNTIME_DIR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; main.py ./&lt;/span&gt;

&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["main.handler"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the first stage, &lt;code&gt;build&lt;/code&gt;, we install curl, Git, Git LFS, and Poetry.&lt;/p&gt;

&lt;p&gt;Then we download the &lt;code&gt;nlpconnect/vit-gpt2-image-captioning&lt;/code&gt; model from 🤗 Hugging Face, just as we previously did locally. Finally, we use Poetry to generate a &lt;code&gt;requirements.txt&lt;/code&gt; file with our production Python dependencies.&lt;/p&gt;

&lt;p&gt;Then we use the official Docker image for Lambda with Python. Firstly, we copy the ML model fetched in the build stage and the &lt;code&gt;requirements.txt&lt;/code&gt; file. Next, we install Python dependencies with pip and copy the sources of our Lambda function - the Python code we wrote above. Finally, we instruct that our Lambda handler is the handler method from the &lt;code&gt;main.py&lt;/code&gt; file.&lt;/p&gt;
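
&lt;p&gt;As a side note, the AWS base images ship with the Lambda Runtime Interface Emulator, so you can smoke-test the image locally before deploying. After building it and starting a container with &lt;code&gt;docker run -p 9000:8080 my-image&lt;/code&gt; (the image tag is up to you), you can post an event to the local endpoint. A rough sketch, assuming the &lt;code&gt;requests&lt;/code&gt; library is installed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import requests

# Invocation endpoint exposed by the Lambda Runtime Interface Emulator
URL = "http://localhost:9000/2015-03-31/functions/function/invocations"

# Same hypothetical S3 event as before; the function still calls real
# S3 and DynamoDB, so the container needs AWS credentials and TABLE_NAME set
event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-images-bucket"}, "object": {"key": "uploads/cat.jpg"}}}
    ]
}

response = requests.post(URL, json=event)
print(response.status_code, response.text)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;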

&lt;h3&gt;
  
  
  Importance of Dockerfile commands order
&lt;/h3&gt;

&lt;p&gt;The order of operations in our &lt;code&gt;Dockerfile&lt;/code&gt; is essential. Each command creates &lt;a href="https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#leverage-build-cache"&gt;a cacheable layer&lt;/a&gt;. But if one layer is changed, all the next are rebuilt. That's why &lt;strong&gt;we want to have the largest and least frequently changed layers first and the ones changed more often last&lt;/strong&gt;. So in our image, we have, in order:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ML model&lt;/li&gt;
&lt;li&gt;Python libraries&lt;/li&gt;
&lt;li&gt;Lambda code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When we change the Lambda code, only that last layer is updated. That means no time-consuming operations, like fetching the ML model or the Python libraries, happen on every code update. Also, when making changes and deploying the Docker image, only that last, thin layer will be uploaded each time, not the full 3 GB image.&lt;/p&gt;

&lt;h2&gt;
  
  
  Provisioning ML pipeline with CDK
&lt;/h2&gt;

&lt;p&gt;Now we need to provision AWS infrastructure. It's a simple CDK Stack with three constructs - the DynamoDB table, Lambda function, and S3 bucket.&lt;/p&gt;

&lt;p&gt;Setting up the CDK project from scratch is out of the scope of this tutorial, but you can find the complete source in the GitHub project repository at the end of the post. Here is the essential part - the MLStack that contains our resources.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;cdk/ml_stack.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;aws_cdk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Stack&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;RemovalPolicy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Duration&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;aws_cdk.aws_dynamodb&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Table&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BillingMode&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Attribute&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AttributeType&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;aws_cdk.aws_lambda&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DockerImageFunction&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;DockerImageCode&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;aws_cdk.aws_logs&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RetentionDays&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;aws_cdk.aws_s3&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EventType&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;aws_cdk.aws_s3_notifications&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LambdaDestination&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;constructs&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Construct&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MLStack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Stack&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Construct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;construct_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;construct_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;captions_table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"CaptionsTable"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;removal_policy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;RemovalPolicy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DESTROY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;billing_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BillingMode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PAY_PER_REQUEST&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;partition_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"key"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AttributeType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;STRING&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;captioning_lambda&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DockerImageFunction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"CaptioningLambda"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DockerImageCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_image_asset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"./captioning_lambda/"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;memory_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;minutes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;log_retention&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;RetentionDays&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ONE_MONTH&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;environment&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="s"&gt;"TABLE_NAME"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;captions_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;captions_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;grant_write_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;captioning_lambda&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;images_bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"ImagesBucket"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;removal_policy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;RemovalPolicy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DESTROY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;auto_delete_objects&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;images_bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_event_notification&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EventType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OBJECT_CREATED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;LambdaDestination&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;captioning_lambda&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;images_bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;grant_read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;captioning_lambda&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The DynamoDB table definition is pretty simple. It's just a table with on-demand billing.&lt;/p&gt;

&lt;p&gt;The S3 bucket is not complicated either. We add an event notification rule to it to invoke the Lambda for every new object created in the bucket.&lt;/p&gt;

&lt;p&gt;We also add proper permissions to Lambda to access the DynamoDB table and S3 bucket.&lt;/p&gt;

&lt;p&gt;For the Lambda function, we use the &lt;code&gt;DockerImageFunction&lt;/code&gt; construct. We point to the &lt;code&gt;Dockerfile&lt;/code&gt; location for the source code, and the CDK will handle building the Docker image.&lt;/p&gt;

&lt;h3&gt;
  
  
  Adjusting Lambda memory and CPU for Machine Learning
&lt;/h3&gt;

&lt;p&gt;ML libraries require a lot of memory, partially because they need to load huge ML models. Here we set the maximum possible memory size for the Lambda - 10 GB. However, our model does not require this much - 5 GB would be enough.&lt;/p&gt;

&lt;p&gt;But the amount of allocated memory translates to the allocated CPU power. That's why adding more memory is the first step for &lt;a href="https://betterdev.blog/aws-lambda-performance-optimization/#increase_memory"&gt;optimizing Lambda functions&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;ML operations are compute-expensive. In Lambda, without GPU, everything is done on the CPU. &lt;strong&gt;Therefore, the more CPU power is available, the faster our function will work.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;On the other hand, remember that increasing the memory allocation also increases Lambda invocation cost. So with heavy usage, it's worth finding the best balance between the execution speed and costs. I detailed how to do this in my &lt;a href="https://betterdev.blog/aws-lambda-performance-optimization/"&gt;Lambda performance optimization&lt;/a&gt; post.&lt;/p&gt;
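&lt;p&gt;For reference, the memory allocation is a single parameter of the function construct. A minimal sketch (the construct ID and Dockerfile directory here are illustrative, and other parameters are omitted):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from aws_cdk.aws_lambda import DockerImageCode, DockerImageFunction

captioning_lambda = DockerImageFunction(
    self, "CaptioningLambda",  # illustrative ID
    code=DockerImageCode.from_image_asset("lambda"),  # directory with the Dockerfile
    memory_size=10240,  # 10 GB - the maximum; allocated CPU power scales with memory
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;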

&lt;h3&gt;
  
  
  Deploying the CDK stack
&lt;/h3&gt;

&lt;p&gt;With &lt;a href="https://docs.aws.amazon.com/cdk/v2/guide/getting_started.html#getting_started_install"&gt;CDK CLI installed&lt;/a&gt;, the deployment is as simple as running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cdk deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When we run the deployment for the first time, the CDK will build and upload our Lambda function Docker image. It can take a couple of minutes, as it requires fetching gigabytes of dependencies from the internet and then uploading the image to AWS. But consecutive deployments, if we modify only the Lambda code, will be much faster thanks to the image layer caching described before.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing serverless image captions generation
&lt;/h2&gt;

&lt;p&gt;After uploading several images to the S3 bucket and checking the DynamoDB table after a moment, we see the automatically generated captions:&lt;/p&gt;

&lt;p&gt;Quite good!&lt;/p&gt;

&lt;h2&gt;
  
  
  Minimizing serverless ML pipeline latency
&lt;/h2&gt;

&lt;p&gt;Now, let's look at the latency.&lt;/p&gt;

&lt;p&gt;During the first Lambda function invocation, it goes through a &lt;a href="https://betterdev.blog/aws-lambda-performance-optimization/#cold_starts"&gt;cold start&lt;/a&gt;. First, AWS fetches the Docker image, provisions the environment, and executes everything we have outside the handler method. Only then is Lambda ready to handle the event.&lt;/p&gt;

&lt;p&gt;The same happens after every Lambda update or after it's inactive (not called) for some time.&lt;/p&gt;

&lt;p&gt;Here is a sample invocation with cold start traced with AWS X-Ray:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--lXMzbNKe--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/97oxu8h8br6f8vh9t19y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--lXMzbNKe--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/97oxu8h8br6f8vh9t19y.png" alt="Lambda invocation timeline with cold start" width="880" height="593"&gt;&lt;/a&gt;&lt;/p&gt;
Lambda invocation timeline with cold start



&lt;p&gt;Cold start - highlighted &lt;code&gt;Initialization&lt;/code&gt; segment - took 11.4s. From the additional logs, I know that 10s of it was loading the ML models (lines &lt;code&gt;13-15&lt;/code&gt; of the Lambda code).&lt;/p&gt;

&lt;p&gt;Then, the longest part of the invocation was generating the caption, which took almost 2s.&lt;/p&gt;

&lt;p&gt;On consecutive runs, there is no cold start (&lt;code&gt;Initialization&lt;/code&gt;) part, so the entire execution ends in less than 3s.&lt;/p&gt;

&lt;p&gt;Here are the measured times for other cold start invocations:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;initialization total&lt;/th&gt;
&lt;th&gt;loading models&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;11.4s&lt;/td&gt;
&lt;td&gt;10.0s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;52.3s&lt;/td&gt;
&lt;td&gt;17.6s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;16.0s&lt;/td&gt;
&lt;td&gt;14.6s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;17.6s&lt;/td&gt;
&lt;td&gt;16.3s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2m27s&lt;/td&gt;
&lt;td&gt;2m15s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10.7s&lt;/td&gt;
&lt;td&gt;9.5s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11.3s&lt;/td&gt;
&lt;td&gt;10.0s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9.5s&lt;/td&gt;
&lt;td&gt;8.3s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As you can see, while the cold start was usually under 20 seconds, occasionally it took much longer - even more than 2 minutes. This variability is something I observe with large Docker images and CPU-intensive initializations, typical for ML workloads. It comes from AWS internals and is not something we can improve ourselves.&lt;/p&gt;

&lt;h3&gt;
  
  
  Optimizing cold starts
&lt;/h3&gt;

&lt;p&gt;Contrary to popular belief, cold starts are not so big of a problem. They happen rarely:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;on the initial invocation,&lt;/li&gt;
&lt;li&gt;when the invocation count increases and Lambda scales up to accommodate it,&lt;/li&gt;
&lt;li&gt;after the function was not invoked for some time and AWS freed allocated resources.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On production systems, cold starts can affect less than 0.1% of invocations.&lt;/p&gt;

&lt;p&gt;But if you do need to reduce them, what can you do?&lt;/p&gt;

&lt;h4&gt;
  
  
  Storing files on EFS
&lt;/h4&gt;

&lt;p&gt;One option I tried in the past is using &lt;a href="https://docs.aws.amazon.com/efs/latest/ug/whatisefs.html"&gt;EFS&lt;/a&gt;. It's a file system that you can attach to the Lambda function. By putting large Python libraries and ML models there, you no longer deal with a large deployment package, so you can use the native Python Lambda runtime instead of the Docker image. A smaller bundle and the native Lambda runtime both reduce the cold start latency.&lt;/p&gt;
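&lt;p&gt;For context, attaching an EFS file system to a Lambda function in CDK looks roughly like this - a sketch only, not part of this project's stack; the VPC, construct IDs, and paths are illustrative:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from aws_cdk import aws_ec2 as ec2, aws_efs as efs, aws_lambda as lambda_

# EFS requires a VPC, and so does the Lambda function that mounts it
vpc = ec2.Vpc(self, "Vpc")
file_system = efs.FileSystem(self, "ModelsFileSystem", vpc=vpc)
access_point = file_system.add_access_point(
    "ModelsAccessPoint",
    path="/models",
    create_acl=efs.Acl(owner_uid="1001", owner_gid="1001", permissions="750"),
    posix_user=efs.PosixUser(uid="1001", gid="1001"),
)

ml_lambda = lambda_.Function(
    self, "MlLambda",
    runtime=lambda_.Runtime.PYTHON_3_9,
    handler="index.handler",
    code=lambda_.Code.from_asset("lambda"),
    vpc=vpc,
    # Mount the file system with the models under /mnt/models
    filesystem=lambda_.FileSystem.from_efs_access_point(access_point, "/mnt/models"),
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;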

&lt;p&gt;But in our case, most of the cold start time is not a result of fetching the large Docker image but of loading the ML model into memory. And EFS does not solve this part.&lt;/p&gt;

&lt;p&gt;Instead, EFS introduces several complications. First, you need an EC2 instance to put the files on EFS, as there is no easy way to upload files to EFS during the deployment. Additionally, you must put your Lambda in a VPC to attach the EFS.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Altogether, the drawbacks of using EFS in this scenario heavily outweigh the benefits.&lt;/strong&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Initializing ahead of time with provisioned concurrency
&lt;/h4&gt;

&lt;p&gt;The more comprehensive solution for Lambda cold starts is to use &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/provisioned-concurrency.html"&gt;provisioned concurrency&lt;/a&gt;. It's basically like asking AWS to keep the given number of function runtimes active for us.&lt;/p&gt;

&lt;p&gt;Provisioned function initialization happens during the deployment. As a result, after the deployment completes, our Lambda function is ready to handle events - without a cold start.&lt;/p&gt;

&lt;p&gt;However, provisioned concurrency incurs costs for the number of runtimes we require active, regardless of whether the Lambda is invoked. Also, keep in mind that if there are more events in the queue than already provisioned functions can handle at once, Lambda will create new environments - with cold starts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Provisioned concurrency is a good solution for client-facing ML Lambdas, where we cannot allow a 20-second cold start.&lt;/strong&gt; Nonetheless, for asynchronous processes, like in our case, it's most often a waste of money.&lt;/p&gt;

&lt;p&gt;To set up provisioned concurrency, you create a Lambda alias and specify the number of provisioned concurrent executions it should keep ready:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;captioning_lambda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_alias&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"live"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;provisioned_concurrent_executions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Reducing concurrent cold starts
&lt;/h3&gt;

&lt;p&gt;If we upload five images to the S3 bucket at once, then, due to the long cold start, we will see five separate Lambda runtimes created, each handling one invocation - assuming, of course, that no "hot" runtimes of our Lambda function already exist.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7M1zFviW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7fii9s8gg3wzlzyogst8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7M1zFviW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7fii9s8gg3wzlzyogst8.png" alt="Lambda runtimes and cold starts" width="461" height="229"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That may or may not be what we want. But with a cold start so disproportionately long compared to the invocation itself, having just a single runtime provisioned and handling all the invocations would not take much longer. And &lt;a href="https://bitesizedserverless.com/bite/when-is-the-lambda-init-phase-free-and-when-is-it-billed/"&gt;we pay for the initialization time of the Docker-based runtimes&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So sometimes, if we know a limited number of runtimes is enough and our workloads come in batches (like uploading multiple images to our bucket simultaneously), it may be wise to limit the number of runtimes Lambda can create. We do this by setting the &lt;a href="https://docs.aws.amazon.com/lambda/latest/operatorguide/reserved-concurrency.html"&gt;reserved concurrency&lt;/a&gt; for Lambda. In CDK, it's the &lt;code&gt;reserved_concurrent_executions&lt;/code&gt; property of the Lambda function construct.&lt;/p&gt;

&lt;p&gt;For example, with reserved concurrency set to 1, Lambda will create only a single runtime environment. All events will be queued on the Lambda internal queue and executed one by one.&lt;/p&gt;
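&lt;p&gt;In CDK, setting it could look like this (a sketch with an illustrative construct ID; other parameters are omitted):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;captioning_lambda = DockerImageFunction(
    self, "CaptioningLambda",
    code=DockerImageCode.from_image_asset("lambda"),
    reserved_concurrent_executions=1,  # at most one concurrent runtime environment
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;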

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--kKPRQdWg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0lx9d4qduwyhkgms5gpc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--kKPRQdWg--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0lx9d4qduwyhkgms5gpc.png" alt="Lambda runtime with multiple invocations" width="461" height="129"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Of course, we need to ensure that the workload won't be bigger than the throughput of our Lambda. If we keep uploading more images than the set number of runtimes can process, the queue will continue to grow. Eventually, &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/invocation-async.html"&gt;events that won't be processed in 6 hours (by default) will get dropped&lt;/a&gt;.&lt;/p&gt;
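&lt;p&gt;If silently dropping events is not acceptable, one option - sketched here as an addition, not part of the original stack - is to configure an on-failure destination, so events that expire or exhaust their retries land in an SQS queue for inspection or replay:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from aws_cdk import aws_sqs as sqs
from aws_cdk.aws_lambda_destinations import SqsDestination

failed_events_queue = sqs.Queue(self, "FailedCaptioningEvents")

# Send events that fail all retries or exceed the maximum event age to SQS
captioning_lambda.configure_async_invoke(
    retry_attempts=2,
    on_failure=SqsDestination(failed_events_queue),
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;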

&lt;h2&gt;
  
  
  Summary and lessons learned
&lt;/h2&gt;

&lt;p&gt;Well, that is quite a long post. But I wanted it to reflect real-life objectives and considerations, so I went into detail. I hope it will be helpful for you.&lt;/p&gt;

&lt;p&gt;Let's recap.&lt;/p&gt;

&lt;p&gt;Nowadays, &lt;strong&gt;we can fetch pre-trained Machine Learning models for various cases from the internet and use them in a few lines of code&lt;/strong&gt;. That's great. However, if you have more ML experience, using smaller, more specialized libraries instead of high-level ones could benefit both initial deployment times and cold starts. That said, the libraries' size is not the biggest problem, so this is optional.&lt;/p&gt;

&lt;p&gt;No matter what Python libraries we choose, ML models are usually at least 1 GB in size anyway. That means &lt;strong&gt;we need to use a Docker image instead of native Lambda runtimes&lt;/strong&gt;. But that's okay. If we &lt;strong&gt;order commands in the Dockerfile correctly&lt;/strong&gt;, only the first image build and upload will take a long time. Consecutive ones will take seconds, as we will only update the last image layer containing our code.&lt;/p&gt;

&lt;p&gt;The ML Lambda functions require &lt;strong&gt;a generous amount of memory assigned&lt;/strong&gt; for two reasons. Firstly, ML libraries load models into memory, so too little memory will result in out-of-memory errors. Secondly, Lambda scales the CPU with the memory, and ML operations are CPU-intensive. Therefore, the more CPU power there is, the lower the latency of our function.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Initializing ML libraries with models increases the cold start&lt;/strong&gt;, which can take a significant amount of time. &lt;strong&gt;If the ML pipeline is an asynchronous process, that's probably not an issue.&lt;/strong&gt; Cold starts happen only from time to time. But &lt;strong&gt;it may be worth paying extra for Lambda provisioned concurrency in client-facing ML Lambdas&lt;/strong&gt;, where we cannot allow such long cold starts.&lt;/p&gt;

&lt;p&gt;And finally, due to the long cold starts, &lt;strong&gt;limiting the number of Lambda runtimes created with the reserved concurrency setting may be a good idea if our workloads come in batches&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;With both &lt;strong&gt;provisioned concurrency&lt;/strong&gt; and &lt;strong&gt;reserved concurrency&lt;/strong&gt;, we should pay extra attention to proper monitoring.&lt;/p&gt;

&lt;p&gt;You can find the complete source for this project on GitHub: &lt;a href="https://betterdev.blog/serverless-ml-on-aws-lambda/"&gt;aws-lambda-ml&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Least deployment privilege with CDK Bootstrap</title>
      <dc:creator>Maciej Radzikowski</dc:creator>
      <pubDate>Mon, 26 Sep 2022 13:00:34 +0000</pubDate>
      <link>https://forem.com/aws-builders/least-deployment-privilege-with-cdk-bootstrap-3ha5</link>
      <guid>https://forem.com/aws-builders/least-deployment-privilege-with-cdk-bootstrap-3ha5</guid>
      <description>&lt;p&gt;Security is not convenient. That’s probably why the CDK, by default, uses &lt;code&gt;AdministratorAccess&lt;/code&gt; Policy to deploy resources. But we can easily change it and increase the security of our AWS account, following the least privilege principle with a minimal additional burden.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dangers of default CDK Bootstrap
&lt;/h2&gt;

&lt;p&gt;To start using the CDK, we must bootstrap our AWS account. Bootstrapping creates the resources required by the CDK on the account.&lt;/p&gt;

&lt;p&gt;If we follow &lt;a href="https://docs.aws.amazon.com/cdk/v2/guide/getting_started.html" rel="noopener noreferrer"&gt;the official docs&lt;/a&gt; for getting started with CDK, the process is as simple as it can be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; aws-cdk

&lt;span class="nv"&gt;$ &lt;/span&gt;cdk bootstrap aws://372507991746/eu-west-1 &lt;span class="se"&gt;\&lt;/span&gt;

 ⏳  Bootstrapping environment aws://372507991746/eu-west-1...
Trusted accounts &lt;span class="k"&gt;for &lt;/span&gt;deployment: &lt;span class="o"&gt;(&lt;/span&gt;none&lt;span class="o"&gt;)&lt;/span&gt;
Trusted accounts &lt;span class="k"&gt;for &lt;/span&gt;lookup: &lt;span class="o"&gt;(&lt;/span&gt;none&lt;span class="o"&gt;)&lt;/span&gt;
Using default execution policy of &lt;span class="s1"&gt;'arn:aws:iam::aws:policy/AdministratorAccess'&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
Pass &lt;span class="s1"&gt;'--cloudformation-execution-policies'&lt;/span&gt; to customize.
CDKToolkit: creating CloudFormation changeset...
 ✅  Environment aws://372507991746/eu-west-1 bootstrapped.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;cdk bootstrap&lt;/code&gt; command creates a CloudFormation Stack named &lt;code&gt;CDKToolkit&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This stack contains 5 IAM Roles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;CloudFormationExecutionRole&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DeploymentActionRole&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;LookupRole&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;FilePublishingRole&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ImagePublishingRole&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What are they used for?&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Why is leaving them as they are against the least privilege principle?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
And how can we fix this?&lt;/p&gt;

&lt;p&gt;Read on.&lt;/p&gt;
&lt;h2&gt;
  
  
  IAM Roles created by CDK Bootstrap
&lt;/h2&gt;
&lt;h3&gt;
  
  
  CloudFormationExecutionRole
&lt;/h3&gt;

&lt;p&gt;This is the Role that CloudFormation will assume to deploy our Stacks. CloudFormation will use this Role both when we deploy from our local machine with the &lt;code&gt;cdk deploy&lt;/code&gt; command and through &lt;a href="https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.pipelines-readme.html" rel="noopener noreferrer"&gt;CDK Pipelines&lt;/a&gt; for CI/CD.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;CloudFormationExecutionRole&lt;/code&gt; must have permissions to list, create, modify and delete all the resources we use in our Stacks. For example, if our Stack contains a Lambda function, CloudFormation must have permission to create it.&lt;/p&gt;

&lt;p&gt;To allow creating any kind of resource with CDK, this Role has the &lt;code&gt;arn:aws:iam::aws:policy/AdministratorAccess&lt;/code&gt; Policy assigned by default. That’s right – &lt;strong&gt;it gives full access to our account, allowing it to do anything&lt;/strong&gt;. That’s very much against the least privilege principle.&lt;/p&gt;
&lt;h4&gt;
  
  
  Dangers of the &lt;code&gt;AdministratorAccess&lt;/code&gt; Policy
&lt;/h4&gt;

&lt;p&gt;Why is it bad? Don’t we want the CDK to be able to create any resources we need in our Stacks?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;We want the CDK to be able to deploy only the resources we use.&lt;/strong&gt; So, for example, if we build an app utilizing just a few serverless services, like Lambda, API Gateway, and DynamoDB, we don’t want the CDK to be able to spin up EC2 machines.&lt;/p&gt;

&lt;p&gt;Suppose our computer or the code repository with automatic deployment through the CI pipeline gets compromised. In that case, the attacker can use the CDK to deploy a CloudFormation stack with a bunch of EC2 machines mining bitcoins.&lt;/p&gt;

&lt;p&gt;Security, like onions and ogres, has layers. &lt;strong&gt;Each layer should prevent the attacker from achieving their goal.&lt;/strong&gt; The fact that we have a password on our computer and that the code repository is private doesn’t justify leaving the next set of doors behind them wide open.&lt;/p&gt;

&lt;p&gt;Thankfully, we can improve it. Looking again at the output of the &lt;code&gt;cdk bootstrap&lt;/code&gt; command, we can notice this message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Using default execution policy of &lt;span class="s1"&gt;'arn:aws:iam::aws:policy/AdministratorAccess'&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
Pass &lt;span class="s1"&gt;'--cloudformation-execution-policies'&lt;/span&gt; to customize.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Stay tuned; we will fix it in a moment. But first, let’s make sure we know what the other IAM Roles created by the CDK do.&lt;/p&gt;

&lt;h3&gt;
  
  
  DeploymentActionRole
&lt;/h3&gt;

&lt;p&gt;The CDK CLI and CDK Pipelines assume this Role to create and manage CloudFormation Stacks and the files in a CDK assets S3 Bucket.&lt;/p&gt;

&lt;p&gt;It also allows passing the &lt;code&gt;CloudFormationExecutionRole&lt;/code&gt; to CloudFormation. Then CloudFormation can use it to create, update and delete resources.&lt;/p&gt;

&lt;p&gt;Moreover, the &lt;code&gt;DeploymentActionRole&lt;/code&gt; allows accessing and managing objects in S3 Buckets in other accounts, which is needed for cross-account deployments.&lt;/p&gt;

&lt;h3&gt;
  
  
  LookupRole
&lt;/h3&gt;

&lt;p&gt;CDK CLI uses the &lt;code&gt;LookupRole&lt;/code&gt; when it needs to get information about the already existing resources that we want to use in our CDK app. Those resources include Route53 Hosted Zones, VPCs, SSM Parameters, and a few others.&lt;/p&gt;

&lt;p&gt;The bad part is that the &lt;code&gt;LookupRole&lt;/code&gt; uses a &lt;code&gt;ReadOnlyAccess&lt;/code&gt; IAM Policy, which gives it access to read everything, not only the resources the CDK can do a lookup for.&lt;/p&gt;

&lt;p&gt;On the bright side, it’s just read-only access, and &lt;code&gt;kms:Decrypt&lt;/code&gt; is explicitly excluded from it through the second Policy attached to the &lt;code&gt;LookupRole&lt;/code&gt;, so it can’t be used to read encrypted data and secrets.&lt;/p&gt;

&lt;h3&gt;
  
  
  FilePublishingRole and ImagePublishingRole
&lt;/h3&gt;

&lt;p&gt;Those two Roles allow the CDK to upload and manage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;assets (like Lambda function sources) in the CDK assets bucket,&lt;/li&gt;
&lt;li&gt;container images in the CDK ECR repository.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those assets and images are built from our application code, uploaded by the CDK, and then referenced in the CloudFormation Stacks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Limiting the CDK Execution Policy access
&lt;/h2&gt;

&lt;p&gt;After reviewing the IAM Roles created by the CDK bootstrap process, we can see the most problematic is the &lt;code&gt;CloudFormationExecutionRole&lt;/code&gt;. It gives CDK full access to our AWS account, while it should only allow deploying and managing the types of resources we use in our app.&lt;/p&gt;

&lt;p&gt;Let’s fix this.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating own CloudFormation Execution Policy
&lt;/h3&gt;

&lt;p&gt;We start by creating our own IAM Policy. It should allow access to only the AWS services that we use in our CDK application. On the other hand, it needs broad access within those selected services, so we will simply give full access to them with an asterisk wildcard (&lt;code&gt;*&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Additionally, we will limit access to only the region where we operate. In this example, it will be &lt;code&gt;eu-west-1&lt;/code&gt;. Some services, like CloudFront, are global, so we list them separately with no region restriction.&lt;/p&gt;

&lt;p&gt;And finally, permissions for IAM actions. As a service managing access to other AWS components, IAM is critical to security. At the same time, it has over 200 actions, so we select only the ones required for our Stacks to work. We also exclude access to the Roles generated by the CDK and to the Policy itself for additional protection.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;cdkCFExecutionPolicy.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"apigateway:*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"cloudwatch:*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"lambda:*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"logs:*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"s3:*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"ssm:*"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Condition"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"StringEquals"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"aws:RequestedRegion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"eu-west-1"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"cloudfront:*"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"iam:*Role*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"iam:GetPolicy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"iam:CreatePolicy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"iam:DeletePolicy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"iam:*PolicyVersion*"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"NotResource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:iam::*:role/cdk-*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:iam::*:policy/cdkCFExecutionPolicy"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This policy document should be committed to the project repository, as it will evolve with time.&lt;/p&gt;

&lt;p&gt;Having the JSON file, we need to create the IAM Policy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws iam create-policy &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--policy-name&lt;/span&gt; cdkCFExecutionPolicy &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--policy-document&lt;/span&gt; file://cdkCFExecutionPolicy.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Bootstrapping CDK with custom Execution Policy
&lt;/h3&gt;

&lt;p&gt;Now we need to bootstrap the CDK, providing the created IAM Policy to be used instead of the default &lt;code&gt;AdministratorAccess&lt;/code&gt; one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ACCOUNT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws sts get-caller-identity &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Account"&lt;/span&gt; &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;
cdk bootstrap aws://&lt;span class="nv"&gt;$ACCOUNT_ID&lt;/span&gt;/eu-west-1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cloudformation-execution-policies&lt;/span&gt; &lt;span class="s2"&gt;"arn:aws:iam::&lt;/span&gt;&lt;span class="nv"&gt;$ACCOUNT_ID&lt;/span&gt;&lt;span class="s2"&gt;:policy/cdkCFExecutionPolicy"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And… that’s it. &lt;strong&gt;That’s how easy it is to apply the least privilege principle to CDK deployments.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If we have the CDK already bootstrapped on our account, simply rerunning &lt;code&gt;cdk bootstrap&lt;/code&gt;, this time with our custom execution policy, will update it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Updating the Policy
&lt;/h3&gt;

&lt;p&gt;With time, we add more services to our application. This requires us to extend the &lt;code&gt;cdkCFExecutionPolicy&lt;/code&gt; with access to additional services.&lt;/p&gt;

&lt;p&gt;To do this, we first modify the definition in the &lt;code&gt;cdkCFExecutionPolicy.json&lt;/code&gt;. Then we create a new Policy version and set it as the default one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ACCOUNT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws sts get-caller-identity &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"Account"&lt;/span&gt; &lt;span class="nt"&gt;--output&lt;/span&gt; text&lt;span class="si"&gt;)&lt;/span&gt;
aws iam create-policy-version &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--policy-arn&lt;/span&gt; arn:aws:iam::&lt;span class="nv"&gt;$ACCOUNT_ID&lt;/span&gt;:policy/cdkCFExecutionPolicy &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--policy-document&lt;/span&gt; file://cdkCFExecutionPolicy.json &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set-as-default&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From now on, the CDK will be using the updated Policy.&lt;/p&gt;

&lt;p&gt;There is a limit of 5 Policy versions, so at some point we need to delete old versions to make further updates. But it’s not difficult. We simply list the existing versions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws iam list-policy-versions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--policy-arn&lt;/span&gt; arn:aws:iam::&lt;span class="nv"&gt;$ACCOUNT_ID&lt;/span&gt;:policy/cdkCFExecutionPolicy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And then delete the selected old version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws iam delete-policy-version &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--policy-arn&lt;/span&gt; arn:aws:iam::&lt;span class="nv"&gt;$ACCOUNT_ID&lt;/span&gt;:policy/cdkCFExecutionPolicy &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--version-id&lt;/span&gt; &amp;lt;VERSION&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Multiple projects
&lt;/h2&gt;

&lt;p&gt;Best practices recommend having only a single project per AWS account. But if we really need to deploy a second CDK project to the same account, here is how to bootstrap it with its own execution policy.&lt;/p&gt;

&lt;p&gt;The first step is to create and deploy the new IAM Policy for the second project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws iam create-policy &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--policy-name&lt;/span&gt; cdkCFExecutionPolicy2ndProject &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--policy-document&lt;/span&gt; file://cdkCFExecutionPolicy.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then we bootstrap the CDK on the same account, creating a separate set of IAM Roles by adding two flags to the &lt;code&gt;cdk bootstrap&lt;/code&gt; command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cdk bootstrap aws://&lt;span class="nv"&gt;$ACCOUNT_ID&lt;/span&gt;/eu-west-1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cloudformation-execution-policies&lt;/span&gt; &lt;span class="s2"&gt;"arn:aws:iam::&lt;/span&gt;&lt;span class="nv"&gt;$ACCOUNT_ID&lt;/span&gt;&lt;span class="s2"&gt;:policy/cdkCFExecutionPolicy2ndProject"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--toolkit-stack-name&lt;/span&gt; CDKToolkitMySecondProject &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--qualifier&lt;/span&gt; 2ndProject
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first one, &lt;code&gt;--toolkit-stack-name&lt;/code&gt;, ensures that a separate CDK stack with its own resources will be created. The default Stack name is &lt;code&gt;CDKToolkit&lt;/code&gt;, so we provide a distinct one.&lt;/p&gt;

&lt;p&gt;The second parameter, &lt;code&gt;--qualifier&lt;/code&gt;, is a short string added to many resource names created by the CDK to avoid name collisions. It must be unique for every project.&lt;/p&gt;

&lt;p&gt;And lastly, for the second project to actually use these newly bootstrapped CDK Roles, we need to add the same qualifier to the project’s &lt;code&gt;cdk.json&lt;/code&gt; configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"app"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@aws-cdk/core:bootstrapQualifier"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2ndProject"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;By default, CDK uses the &lt;code&gt;AdministratorAccess&lt;/code&gt; IAM Policy to deploy CloudFormation Stacks. That’s far from the “least privilege” principle.&lt;/p&gt;

&lt;p&gt;Thankfully, we can quickly improve it for better security. First, we create a custom IAM Policy with access to only the services we use in our application. Then we (re)bootstrap the CDK, providing our Policy ARN as an &lt;code&gt;--cloudformation-execution-policies&lt;/code&gt; argument.&lt;/p&gt;

&lt;p&gt;Over time, if we need to grant the CDK access to more services, we just update the IAM Policy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Afterthoughts on the convenience
&lt;/h3&gt;

&lt;p&gt;They say security is not convenient. In fact, I believe this is why the CDK uses the &lt;code&gt;AdministratorAccess&lt;/code&gt; Policy by default – it allows using the CDK right away, with just one simple &lt;code&gt;cdk bootstrap&lt;/code&gt; command.&lt;/p&gt;

&lt;p&gt;It’s good the &lt;code&gt;cdk bootstrap&lt;/code&gt; output warns about using the &lt;code&gt;AdministratorAccess&lt;/code&gt;, but sadly, I suspect it’s ignored in most cases.&lt;/p&gt;

&lt;p&gt;Luckily, creating a custom Policy and maintaining it is straightforward, so the problem can be fixed quickly.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>cdk</category>
    </item>
    <item>
      <title>The AWS CDK, Or Why I Stopped Being a CDK Skeptic</title>
      <dc:creator>Maciej Radzikowski</dc:creator>
      <pubDate>Thu, 30 Jun 2022 12:54:07 +0000</pubDate>
      <link>https://forem.com/mradzikowski/the-aws-cdk-or-why-i-stopped-being-a-cdk-skeptic-4o2o</link>
      <guid>https://forem.com/mradzikowski/the-aws-cdk-or-why-i-stopped-being-a-cdk-skeptic-4o2o</guid>
      <description>&lt;p&gt;Until recently, I was skeptical about the AWS CDK. I believe in Infrastructure as Code (IaC), but with the "code" being YAML. But after using CDK in real projects, the amount of heavy lifting it does and the vast reduction of a boilerplate code changed my view.&lt;/p&gt;

&lt;p&gt;I'm a long-time user and fan of the &lt;a href="https://www.serverless.com/"&gt;Serverless Framework&lt;/a&gt;, and it was my go-to tool for the IaC on AWS. It provides an abstraction layer on top of the CloudFormation, the AWS infrastructure provisioning service. I thought that with the Serverless Framework, building serverless projects is as straightforward as it can be.&lt;/p&gt;

&lt;p&gt;Then, in July 2019, AWS released the CDK – Cloud Development Kit. Like Serverless Framework, it also uses CloudFormation under the hood. But contrary to the SF, in which the primary way to declare infrastructure is YAML, in CDK, you write code in one of the supported programming languages.&lt;/p&gt;

&lt;p&gt;When others started adopting CDK, I didn't jump on the hype train right away but kept looking from a distance. This changed in the past months.&lt;/p&gt;

&lt;h2&gt;
  
  
  My objection towards CDK
&lt;/h2&gt;

&lt;p&gt;My biggest objection was the core trait of the CDK – declaring infrastructure in a programming language.&lt;/p&gt;

&lt;p&gt;While YAML &lt;a href="https://www.arp242.net/yaml-config.html"&gt;has its problems&lt;/a&gt;, I see it as an elegant and readable solution to define the configuration. And that includes infrastructure configuration.&lt;/p&gt;

&lt;p&gt;On the other hand, it's much easier to make a mess using a programming language, and that's what I was afraid of. When you can use loops, if conditions, and any advanced language features, you can define the infrastructure cleanly and concisely just as easily as you can turn it into spaghetti code. And the infrastructure is the last place where I want to investigate step-by-step what the hell is happening through the code flow.&lt;/p&gt;

&lt;p&gt;To sum up – I believe(d) that it's much less likely to make the infrastructure definition unreadable with YAML than using a programming language.&lt;/p&gt;

&lt;h2&gt;
  
  
  The need for CDK
&lt;/h2&gt;

&lt;p&gt;Then I started working on a new project that required the resources to be created based on the configuration files. During the deployment, multiple instances of Lambda functions and other resources had to be generated with different settings depending on the provided configuration.&lt;/p&gt;

&lt;p&gt;Generating resources dynamically during deployment requires some higher-level logic. While you can use JavaScript/TypeScript instead of YAML to define resources in the Serverless Framework, the CDK, with code as the native way to declare infrastructure, seemed like an obvious choice.&lt;/p&gt;

&lt;p&gt;If you already used CloudFormation or any IaC tool based on it (like Serverless Framework or SAM), the learning curve of CDK is flat. In a few weeks, I scaffolded the new project, finding out how much good there was in the CDK. Now, two projects later, the CDK is the default IaC framework for me.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pros of CDK
&lt;/h2&gt;

&lt;p&gt;The two most obvious pros of CDK are that you can utilize all the power of a programming language to define the infrastructure and that it uses CloudFormation under the hood.&lt;/p&gt;

&lt;p&gt;With languages supported by the CDK – TypeScript, JavaScript, Python, Java, C#, and Go – you write the code in a familiar way. No more esoteric CloudFormation logic instructions in JSON or YAML. And, with the code completion in your IDE, no more checking the documentation for the exact name of every single parameter.&lt;/p&gt;

&lt;p&gt;Equally important, CDK "synthesizes" the code into standard CloudFormation stacks and deploys them. By using CloudFormation, the stable and battle-tested (although slow) IaC system, to perform the actual infrastructure management, CDK can focus on providing the best possible developer experience on top of it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JJeaOB3X--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/hy4m981knco8wc2e79rb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JJeaOB3X--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/hy4m981knco8wc2e79rb.png" alt="CDK synthetizes application to a CloudFormation Template which is then deployed as a CloudFormation Stack" width="841" height="231"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But what really convinced me were things I found only after I started working with the CDK.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reusable custom Constructs
&lt;/h3&gt;

&lt;p&gt;First and foremost, Constructs reusability.&lt;/p&gt;

&lt;p&gt;Constructs are building blocks of the CDK. Like LEGO bricks, you can compose them to make larger, specialized Constructs. Then you can use those Constructs multiple times across the application to create many similar resources without repeating the configuration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--91MAT0te--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3s6okrkge6w9l0shf9vq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--91MAT0te--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3s6okrkge6w9l0shf9vq.png" alt="Constructs can be reused across the application" width="880" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is really powerful. The first thing I did was to create a custom Construct for a Node.js Lambda function, which included basic CloudWatch Alarms. This way, every function in my application is always properly monitored. Overriding the parameters of this custom Lambda Construct, I can customize alarm thresholds and function configuration where needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  High-level Constructs
&lt;/h3&gt;

&lt;p&gt;CDK comes with 3 "levels" of Constructs.&lt;/p&gt;

&lt;p&gt;Level 1, or L1, are low-level Constructs that correspond directly to the CloudFormation resources. Their names are prefixed with &lt;code&gt;Cfn&lt;/code&gt;, like &lt;code&gt;CfnBucket&lt;/code&gt; or &lt;code&gt;CfnFunction&lt;/code&gt;. With them, you work with exactly the same structures as in raw CloudFormation, with the same parameters and behavior. Nothing more, nothing less.&lt;/p&gt;

&lt;p&gt;L2 Constructs, on the other hand, are smarter and provide a higher-level API. They come with sensible defaults and reduce the required boilerplate code to the minimum. They also provide helper functions, for example, to set up IAM permissions.&lt;/p&gt;

&lt;p&gt;L2 Constructs are the core of the CDK. Using them, you apply many best practices for setting up individual resources, like protecting access to the S3 Buckets. And since their API is an abstraction over the CloudFormation properties, it's a lot more concise and readable.&lt;/p&gt;

&lt;p&gt;This snippet creates a secured S3 Bucket and a Lambda function with an IAM Role granting read access to it, which translates to about 80 lines of L1 Constructs code or raw CloudFormation YAML:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;myBucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Bucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;MyBucket&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;encryption&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;BucketEncryption&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;S3_MANAGED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;blockPublicAccess&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;BlockPublicAccess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;BLOCK_ALL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;objectOwnership&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ObjectOwnership&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;BUCKET_OWNER_ENFORCED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;myLambda&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;NodejsFunction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;MyLambda&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;__dirname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;src&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;index.ts&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;runtime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Runtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;NODEJS_16_X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="nx"&gt;myBucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;grantRead&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;myLambda&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;L3 Constructs, also called patterns, are even more abstract building blocks that include multiple resources. For example, with a single L3 Construct, you can create a Fargate service running on an ECS cluster behind an Application Load Balancer. Don't make me count how many CloudFormation resources you need to define to set it up yourself…&lt;/p&gt;

&lt;h3&gt;
  
  
  Sensible defaults
&lt;/h3&gt;

&lt;p&gt;I've mentioned it above, but let me emphasize this. L2 and L3 Constructs include sensible defaults that reduce the boilerplate and make the infrastructure safer, more robust, and closer to the best practices.&lt;/p&gt;

&lt;p&gt;Examples include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;S3 Buckets and DynamoDB tables having the "Retain" Deletion Policy enabled by default to not lose production data in case of accidental stack removal,&lt;/li&gt;
&lt;li&gt;Node.js Lambda functions &lt;a href="https://betterdev.blog/aws-lambda-performance-optimization/#keep_http_connections_alive"&gt;enabling connection reuse to optimize AWS SDK v2&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Utility functions
&lt;/h3&gt;

&lt;p&gt;Utility functions are a great help in L2 and L3 Constructs.&lt;/p&gt;

&lt;p&gt;The ones you will most often come across are the IAM permissions helpers. Instead of laboriously defining IAM policies, you can glue access between Constructs with &lt;code&gt;grant*()&lt;/code&gt; functions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;myLambda&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;NodejsFunction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;MyLambda&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;__dirname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;src&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;index.ts&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;runtime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Runtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;NODEJS_16_X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="nx"&gt;myS3Bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;grantRead&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;myLambda&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;myDynamoDBTable&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;grantReadWriteData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;myLambda&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But IAM helpers are not all. A few other examples include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Function.metricError()&lt;/code&gt; to get a CloudWatch Metric for the Lambda function error count that you can use to set up a CloudWatch Alarm,&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;MappingTemplate.dynamoDbGetItem()&lt;/code&gt; to create an AppSync resolver mapping template to get an item from a DynamoDB Table,&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Arn.format()&lt;/code&gt; to build an ARN from the parts like the region, service name, and the resource name,&lt;/li&gt;
&lt;li&gt;and many more.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Logic above CloudFormation
&lt;/h3&gt;

&lt;p&gt;CloudFormation, used under the hood by the CDK, deals only with the infrastructure and is very strict about it. It takes the current state of the resources and modifies them to achieve the expected state.&lt;/p&gt;

&lt;p&gt;But deployment often includes more than just creating cloud resources. For example, you need to upload your application code, sometimes upload some assets to S3 buckets, or create an SSL certificate for the domain in the &lt;code&gt;us-east-1&lt;/code&gt; region for the CloudFront. Those actions are outside of the CloudFormation scope.&lt;/p&gt;

&lt;p&gt;Thankfully, they are not outside of the CDK scope. For example, the Lambda &lt;code&gt;Function&lt;/code&gt; Construct will bundle and upload your function code. The S3 &lt;code&gt;BucketDeployment&lt;/code&gt; Construct will put your website files in a bucket. And &lt;code&gt;DnsValidatedCertificate&lt;/code&gt; Construct will create an SSL certificate in any region you need.&lt;/p&gt;

&lt;p&gt;Another neat built-in feature is the &lt;code&gt;autoDeleteObjects&lt;/code&gt; parameter of the S3 &lt;code&gt;Bucket&lt;/code&gt; Construct. When set to &lt;code&gt;true&lt;/code&gt;, it will empty the bucket on the stack removal, letting the bucket be deleted. This is perfect for website hosting and short-living feature branch environments, where buckets do not contain valuable data.&lt;/p&gt;

&lt;h4&gt;
  
  
  Advantages over Serverless Framework plugins
&lt;/h4&gt;

&lt;p&gt;You can achieve all of the above in the Serverless Framework with plugins. But the CDK has two advantages here.&lt;/p&gt;

&lt;p&gt;Firstly, it's all built-in. No additional dependencies to install and no problems with unmaintained and outdated plugins.&lt;/p&gt;

&lt;p&gt;Secondly, everything is handled server-side (cloud-side?) by custom resources triggered by CloudFormation lifecycle hooks. You run the stack deployment and don't need to worry about keeping an uninterrupted internet connection during the SSL certificate creation and verification. Even better, deleting the stack from the AWS Console will also trigger the bucket content removal - not only deleting it with the CDK CLI.&lt;/p&gt;

&lt;p&gt;In contrast, Serverless Framework plugins usually perform such actions through the AWS SDK calls from your local machine.&lt;/p&gt;

&lt;h3&gt;
  
  
  Multiple stacks support
&lt;/h3&gt;

&lt;p&gt;Most of the projects I work on consist of &lt;a href="https://articles.alfa1.com/serverless-project-structure-beyond-monorepo"&gt;multiple CloudFormation stacks&lt;/a&gt;. In CDK, you create an &lt;code&gt;App&lt;/code&gt;, which may include numerous &lt;code&gt;Stacks&lt;/code&gt;. Then, you can define dependencies between the stacks, and the CDK will take care of deploying them in the correct order.&lt;/p&gt;

&lt;p&gt;The fact that multiple stacks are managed as a single application enables applying settings and making changes to the whole project in one place, without duplication. This includes setting tags or applying &lt;a href="https://docs.aws.amazon.com/cdk/v2/guide/aspects.html"&gt;Aspects&lt;/a&gt; to all the resources in all the stacks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="cm"&gt;/**
 * Set RemovalPolicy DESTROY on LogGroups
 * so they are removed on the stack removal, not kept indefinitely.
 */&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nx"&gt;LogGroupRemovalPolicyAspect&lt;/span&gt; &lt;span class="k"&gt;implements&lt;/span&gt; &lt;span class="nx"&gt;IAspect&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nx"&gt;visit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;node&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;IConstruct&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;node&lt;/span&gt; &lt;span class="k"&gt;instanceof&lt;/span&gt; &lt;span class="nx"&gt;CfnLogGroup&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;applyRemovalPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;RemovalPolicy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DESTROY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;cdk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;App&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="nx"&gt;Tags&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;of&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;projectName&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;mySecretProject&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;Aspects&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;of&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;LogGroupRemovalPolicyAspect&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Built-in CI
&lt;/h3&gt;

&lt;p&gt;With the CDK, you can create a Continuous Integration pipeline for the project. The pipeline, built on AWS CodePipeline, is quite clever. You deploy it once, and if you commit any changes to it, it will mutate itself before deploying your stacks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Construct Hub
&lt;/h3&gt;

&lt;p&gt;Constructs' reusability makes them perfect for sharing. &lt;a href="https://constructs.dev/"&gt;Construct Hub&lt;/a&gt; is a catalog of open-source Constructs built by AWS, AWS partners, and the community.&lt;/p&gt;

&lt;p&gt;Among a variety of L3 Constructs, these three caught my eye:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://constructs.dev/packages/cdk-iam-floyd/"&gt;cdk-iam-floyd&lt;/a&gt; – IAM Policy generator with a fluent interface&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://constructs.dev/packages/cdk-monitoring-constructs/"&gt;cdk-monitoring-constructs&lt;/a&gt; – CloudWatch Dashboard and Alarms for various AWS services&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://constructs.dev/packages/cdk-spa-deploy/"&gt;CDK-SPA-Deploy&lt;/a&gt; – all you need to have a SPA website running&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Cons of CDK
&lt;/h2&gt;

&lt;p&gt;Obviously, I had to find some drawbacks. Otherwise, I would be even angrier that I convinced myself to try the CDK only now.&lt;/p&gt;

&lt;h3&gt;
  
  
One-at-a-time stack deployments
&lt;/h3&gt;

&lt;p&gt;CDK runs on top of CloudFormation, which is famously slow. In a multi-stack application, you'd want to deploy independent stacks in parallel to reduce the overall deployment time.&lt;/p&gt;

&lt;p&gt;Unfortunately, at the moment, this is not possible with the CDK when deploying from your local machine. So you can take a (long) break every time you deploy the whole application. But this will hopefully be resolved soon with &lt;a href="https://github.com/aws/aws-cdk/pull/20345"&gt;a &lt;code&gt;--concurrency&lt;/code&gt; option&lt;/a&gt;.&lt;/p&gt;
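
&lt;p&gt;Once that option lands, a parallel deployment could look like this (a sketch based on the flag name from the linked pull request, so it may change before release):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# deploy all stacks, running up to 4 CloudFormation deployments in parallel where dependencies allow
cdk deploy --all --concurrency 4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
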

&lt;h3&gt;
  
  
  CodePipeline for the CI
&lt;/h3&gt;

&lt;p&gt;The built-in CI, while clever, uses AWS CodePipeline. And if you have worked with CodePipeline, you know it's not the best CI out there.&lt;/p&gt;

&lt;p&gt;The biggest issue I encountered is that you can't retry individual failed stage deployments. The CDK splits stack deployments into creating CloudFormation Change Sets and executing them. If executing a Change Set fails due to a conflict that you then fix, it's impossible to retry the given stage deployment without re-running the whole pipeline. This is because, when retrying an individual stage deployment, CodePipeline runs only the failed actions (the Change Set execution) and does not re-create the Change Sets first.&lt;/p&gt;

&lt;p&gt;Also, if you want to notify your GitHub repository of the pipeline execution result, you need to &lt;a href="https://aws.amazon.com/blogs/devops/aws-codepipeline-build-status-in-a-third-party-git-repository/"&gt;implement the webhook call yourself&lt;/a&gt;. This is yet another example of how great the AWS CodePipeline is.&lt;/p&gt;

&lt;p&gt;Of course, you don't have to use built-in CDK pipelines. Instead, you can script your own deployment on any CI/CD platform you want.&lt;/p&gt;
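
&lt;p&gt;As a minimal sketch, assuming a Node.js CDK project, such a scripted deployment step could be as simple as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# install dependencies and deploy all stacks without interactive approval prompts
npm ci
npx cdk deploy --all --require-approval never
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
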

&lt;h3&gt;
  
  
  Low-value constructs in Construct Hub
&lt;/h3&gt;

&lt;p&gt;I've described the &lt;a href="https://constructs.dev/"&gt;Construct Hub&lt;/a&gt; above as a catalog of high-level CDK Constructs. At the moment, it contains over 1,000 packages.&lt;/p&gt;

&lt;p&gt;That sounds great, but many of them are L3 patterns that are, in my opinion, rather low value. For example, the &lt;a href="https://constructs.dev/packages/@aws-solutions-constructs/aws-lambda-sqs/"&gt;&lt;code&gt;LambdaToSqs&lt;/code&gt;&lt;/a&gt; and &lt;a href="https://constructs.dev/packages/@aws-solutions-constructs/aws-sqs-lambda/"&gt;&lt;code&gt;SqsToLambda&lt;/code&gt;&lt;/a&gt; Constructs integrate just a Lambda function writing to or reading from an SQS queue. Maybe it's just me, but it seems a lot like the &lt;a href="https://www.npmjs.com/package/is-even"&gt;is-even&lt;/a&gt; package – the benefits are too small to justify installing a dependency instead of doing it yourself. But maybe I'm wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;While I still find the Serverless Framework awesome for simpler use cases, I discovered that CDK is the best fit for larger projects. With it, you can reduce the boilerplate to the minimum. Declaring infrastructure is faster with high-level Constructs, code completion, sensible defaults, and utilities. And it's easier to keep high-quality and unified configuration with reusable Constructs.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>cdk</category>
      <category>cloud</category>
      <category>iac</category>
    </item>
    <item>
      <title>Personal backup to Amazon S3 – cheap and easy</title>
      <dc:creator>Maciej Radzikowski</dc:creator>
      <pubDate>Wed, 16 Mar 2022 11:05:57 +0000</pubDate>
      <link>https://forem.com/aws-builders/personal-backup-to-amazon-s3-cheap-and-easy-3c39</link>
      <guid>https://forem.com/aws-builders/personal-backup-to-amazon-s3-cheap-and-easy-3c39</guid>
      <description>&lt;p&gt;In need to backup my personal files in the cloud, I wrote a script that archives the data into the Amazon S3 bucket. After some fine-tuning and solving a bunch of edge-cases, it's limited mainly by the disk read and my internet upload speed. And it costs me only $3.70 per &lt;a href="https://en.wikipedia.org/wiki/Binary_prefix" rel="noopener noreferrer"&gt;TiB&lt;/a&gt; per month.&lt;/p&gt;

&lt;p&gt;Instead of reinventing the wheel, I started with research. There must be a good, easy-to-use cloud backup service, right? But everything I found was too complex and/or expensive. So I wrote the backup script myself.&lt;/p&gt;

&lt;p&gt;Then I did the research again, and the results were quite different - this time, I found a few reasonable services I could use. But I already had the script, I had fun writing it, and I will continue using it, so I decided to share it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Off-site backup for personal files
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://news.ycombinator.com/item?id=29978099" rel="noopener noreferrer"&gt;question&lt;/a&gt; &lt;a href="https://news.ycombinator.com/item?id=12999934" rel="noopener noreferrer"&gt;about&lt;/a&gt; &lt;a href="https://news.ycombinator.com/item?id=25758675" rel="noopener noreferrer"&gt;the&lt;/a&gt; &lt;a href="https://news.ycombinator.com/item?id=13694079" rel="noopener noreferrer"&gt;personal&lt;/a&gt; &lt;a href="https://news.ycombinator.com/item?id=14329524" rel="noopener noreferrer"&gt;backup&lt;/a&gt; &lt;a href="https://news.ycombinator.com/item?id=1946416" rel="noopener noreferrer"&gt;system&lt;/a&gt; is raised from time to time on Hacker News. The commonly recommended approach is the &lt;strong&gt;3-2-1 backup strategy&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;3 copies&lt;/li&gt;
&lt;li&gt;on 2 different media&lt;/li&gt;
&lt;li&gt;with at least 1 copy off-site&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I have an external disk with an archive of documents and photos. This, however, is just one copy kept right next to my laptop. And hard drives fail.&lt;/p&gt;

&lt;p&gt;So I needed an off-site backup.&lt;/p&gt;

&lt;p&gt;Looking for &lt;strong&gt;personal cloud backup&lt;/strong&gt; solutions, I found some overcomplicated, some expensive, and one or two reasonable services. But then I remembered that I work on AWS, and Amazon S3 storage is cheap. This is especially true if you want to archive data and don't touch it too often.&lt;/p&gt;

&lt;p&gt;The result is the script I wrote for backing up files to the S3 bucket.&lt;/p&gt;

&lt;h2&gt;
  
  
  Backup to S3 script
&lt;/h2&gt;

&lt;p&gt;Below is the full, detailed explanation. If you are interested only in the script and usage instructions, you can find the link to the GitHub repository at the end.&lt;/p&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;p&gt;The script uses &lt;a href="https://rclone.org/" rel="noopener noreferrer"&gt;rclone&lt;/a&gt; and &lt;a href="https://www.gnu.org/software/parallel/" rel="noopener noreferrer"&gt;GNU Parallel&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;On macOS, you can install them with Homebrew:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;rclone parallel
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Provision AWS resources
&lt;/h3&gt;

&lt;p&gt;To store backups in an S3 bucket, you need to have such a bucket. And while you could create and configure it by hand, it's easier to provision it with a simple CloudFormation template.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;stack.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;AWSTemplateFormatVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;2010-09-09&lt;/span&gt;  

&lt;span class="na"&gt;Resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  

  &lt;span class="na"&gt;BackupBucket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::S3::Bucket&lt;/span&gt;  
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
      &lt;span class="na"&gt;PublicAccessBlockConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
        &lt;span class="na"&gt;BlockPublicAcls&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;  
        &lt;span class="na"&gt;IgnorePublicAcls&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;  
        &lt;span class="na"&gt;BlockPublicPolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;  
        &lt;span class="na"&gt;RestrictPublicBuckets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;  
      &lt;span class="na"&gt;OwnershipControls&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
        &lt;span class="na"&gt;Rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ObjectOwnership&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;BucketOwnerEnforced&lt;/span&gt;  
      &lt;span class="na"&gt;VersioningConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
        &lt;span class="na"&gt;Status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Enabled&lt;/span&gt;  
      &lt;span class="na"&gt;LifecycleConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
        &lt;span class="na"&gt;Rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AbortIncompleteMultipartUpload&lt;/span&gt;  
            &lt;span class="na"&gt;Status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Enabled&lt;/span&gt;  
            &lt;span class="na"&gt;AbortIncompleteMultipartUpload&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
              &lt;span class="na"&gt;DaysAfterInitiation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;3&lt;/span&gt;  
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;NoncurrentVersionExpiration&lt;/span&gt;  
            &lt;span class="na"&gt;Status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Enabled&lt;/span&gt;  
            &lt;span class="na"&gt;NoncurrentVersionExpiration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
              &lt;span class="na"&gt;NewerNoncurrentVersions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;3&lt;/span&gt;  
              &lt;span class="na"&gt;NoncurrentDays&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;30&lt;/span&gt;  

  &lt;span class="na"&gt;BackupUser&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::IAM::User&lt;/span&gt;  
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
      &lt;span class="na"&gt;Policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;PolicyName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;s3-access&lt;/span&gt;  
          &lt;span class="na"&gt;PolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
            &lt;span class="na"&gt;Version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2012-10-17"&lt;/span&gt;  
            &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
              &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;  
                &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3:*MultipartUpload*'&lt;/span&gt;  
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3:ListBucket'&lt;/span&gt;  
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3:GetObject'&lt;/span&gt;  
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s3:PutObject'&lt;/span&gt;  
                &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;${BackupBucket.Arn}'&lt;/span&gt;  
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;${BackupBucket.Arn}/*'&lt;/span&gt;  

&lt;span class="na"&gt;Outputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  

  &lt;span class="na"&gt;BackupBucketName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
    &lt;span class="na"&gt;Value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;BackupBucket&lt;/span&gt;  

  &lt;span class="na"&gt;BackupUserName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
    &lt;span class="na"&gt;Value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;BackupUser&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The template defines two resources: the &lt;code&gt;BackupBucket&lt;/code&gt; and &lt;code&gt;BackupUser&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;BackupBucket&lt;/code&gt; has &lt;strong&gt;public access disabled&lt;/strong&gt; for all the objects, as you don't want any of the files to be publicly accessible by mistake.&lt;/p&gt;

&lt;p&gt;It also enables &lt;strong&gt;object versioning&lt;/strong&gt;. When uploading new versions of existing files - fresh backups of the same files - the previous ones will be kept instead of being immediately overwritten.&lt;/p&gt;

&lt;p&gt;On the other hand, to avoid keeping old backups indefinitely (and paying for them), the bucket has a lifecycle rule that &lt;strong&gt;automatically removes old file versions&lt;/strong&gt;. It will keep only the last 3 noncurrent versions and remove older ones 30 days after they become noncurrent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhnnv7gjw0mo7igkt3128.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhnnv7gjw0mo7igkt3128.png" alt="S3 lifecycle rule description in AWS Console&amp;lt;br&amp;gt;
"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The other lifecycle rule &lt;strong&gt;aborts incomplete file uploads after 3 days&lt;/strong&gt;. The script uploads big files in multiple chunks. If the process fails or is interrupted, you are still charged for the uploaded chunks until you complete or abort the upload. This rule prevents those incomplete uploads from staying forever and generating charges.&lt;/p&gt;
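
&lt;p&gt;If you're curious whether any incomplete uploads are lingering in the bucket, you can list them with the AWS CLI (the bucket name below is a placeholder):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# list multipart uploads that were started but never completed or aborted
aws s3api list-multipart-uploads --bucket my-backup-bucket-name
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
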

&lt;p&gt;The second created resource is the &lt;code&gt;BackupUser&lt;/code&gt;. It's an IAM user with permission to upload files to the bucket.&lt;/p&gt;
&lt;h4&gt;
  
  
  Deploy stack
&lt;/h4&gt;

&lt;p&gt;To deploy the stack, run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws cloudformation deploy &lt;span class="nt"&gt;--stack-name&lt;/span&gt; backupToS3 &lt;span class="nt"&gt;--template-file&lt;/span&gt; stack.yml &lt;span class="nt"&gt;--capabilities&lt;/span&gt; CAPABILITY_IAM
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Get bucket name
&lt;/h4&gt;

&lt;p&gt;After the deployment is completed, go to CloudFormation in the AWS Console and find the &lt;code&gt;backupToS3&lt;/code&gt; stack. Then, in the "Outputs" tab, you will see the &lt;code&gt;BackupBucketName&lt;/code&gt; key with the generated S3 bucket name. You will need it in a moment.&lt;/p&gt;
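
&lt;p&gt;Alternatively, you can read the stack output directly with the AWS CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# print the generated bucket name from the stack outputs
aws cloudformation describe-stacks --stack-name backupToS3 \
  --query "Stacks[0].Outputs[?OutputKey=='BackupBucketName'].OutputValue" --output text
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
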

&lt;h4&gt;
  
  
  Get access key
&lt;/h4&gt;

&lt;p&gt;Similarly, you will find the &lt;code&gt;BackupUserName&lt;/code&gt; output with the IAM user name. Go to IAM, open that user's details, and create an access key in the "Security credentials" tab.&lt;/p&gt;
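
&lt;p&gt;You can also create the access key from the CLI, substituting the user name from the stack outputs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# returns the AccessKeyId and SecretAccessKey for the rclone configuration
aws iam create-access-key --user-name THE_BACKUP_USER_NAME
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
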

&lt;h3&gt;
  
  
  Setup rclone
&lt;/h3&gt;

&lt;p&gt;rclone requires setting up the storage backend upfront. You can do this by running &lt;code&gt;rclone config&lt;/code&gt; and &lt;a href="https://rclone.org/s3/#configuration" rel="noopener noreferrer"&gt;setting up the S3 remote&lt;/a&gt;, or by manually editing the configuration file.&lt;/p&gt;

&lt;p&gt;In the configuration, set the access key ID and secret access key generated for the IAM user.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;~/.config/rclone/rclone.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[backup]&lt;/span&gt;
&lt;span class="py"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;s3&lt;/span&gt;
&lt;span class="py"&gt;provider&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;aws&lt;/span&gt;
&lt;span class="py"&gt;env_auth&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;false&lt;/span&gt;
&lt;span class="py"&gt;access_key_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;xxxxxx&lt;/span&gt;
&lt;span class="py"&gt;secret_access_key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;xxxxxx&lt;/span&gt;
&lt;span class="py"&gt;acl&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;private&lt;/span&gt;
&lt;span class="py"&gt;region&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;eu-west-1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Backup to S3
&lt;/h3&gt;

&lt;p&gt;Generally, the idea is straightforward: we copy everything to the S3 bucket.&lt;/p&gt;

&lt;p&gt;But things are rarely so simple. So let's break it down, step by step.&lt;/p&gt;

&lt;p&gt;The script is based on the &lt;a href="https://betterdev.blog/minimal-safe-bash-script-template/" rel="noopener noreferrer"&gt;minimal Bash script template&lt;/a&gt;. Bash provides the easiest way to glue together various CLI programs and tools.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;

&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-Eeuo&lt;/span&gt; pipefail

usage&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="c"&gt;# omitted for brevity&lt;/span&gt;
  &lt;span class="nb"&gt;exit&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

parse_params&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nv"&gt;split_depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0
  &lt;span class="nv"&gt;max_size_gb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1024
  &lt;span class="nv"&gt;storage_class&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"GLACIER"&lt;/span&gt;
  &lt;span class="nv"&gt;dry_run&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false

  &lt;/span&gt;&lt;span class="k"&gt;while&lt;/span&gt; :&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
    case&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;1&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt;
    &lt;span class="nt"&gt;-h&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="nt"&gt;--help&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; usage &lt;span class="p"&gt;;;&lt;/span&gt;
    &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="nt"&gt;--verbose&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-x&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;
    &lt;span class="nt"&gt;-b&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="nt"&gt;--bucket&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;2&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;shift&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;
    &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="nt"&gt;--name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;backup_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;2&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;shift&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;
    &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="nt"&gt;--path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;root_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;2&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;shift&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;
    &lt;span class="nt"&gt;--max-size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;max_size_gb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;2&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;shift&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;
    &lt;span class="nt"&gt;--split-depth&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;split_depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;2&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;shift&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;
    &lt;span class="nt"&gt;--storage-class&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;storage_class&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;2&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;shift&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;
    &lt;span class="nt"&gt;--dry-run&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;dry_run&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;
    -?&lt;span class="k"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; die &lt;span class="s2"&gt;"Unknown option: &lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;
    &lt;span class="k"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nb"&gt;break&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;
    &lt;span class="k"&gt;esac&lt;/span&gt;
    &lt;span class="nb"&gt;shift
  &lt;/span&gt;&lt;span class="k"&gt;done&lt;/span&gt;

  &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; die &lt;span class="s2"&gt;"Missing required parameter: bucket"&lt;/span&gt;
  &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;backup_name&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; die &lt;span class="s2"&gt;"Missing required parameter: name"&lt;/span&gt;
  &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;root_path&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; die &lt;span class="s2"&gt;"Missing required parameter: path"&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;0
&lt;span class="o"&gt;}&lt;/span&gt;

main&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nv"&gt;root_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;
    &lt;span class="nb"&gt;cd&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;dirname&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$root_path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;pwd&lt;/span&gt; &lt;span class="nt"&gt;-P&lt;/span&gt;
  &lt;span class="si"&gt;)&lt;/span&gt;/&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;basename&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$root_path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="c"&gt;# convert to absolute path&lt;/span&gt;

  &lt;span class="c"&gt;# division by 10k gives integer (without fraction), round result up by adding 1&lt;/span&gt;
  &lt;span class="nv"&gt;chunk_size_mb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt;max_size_gb &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="m"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="m"&gt;10000&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="k"&gt;))&lt;/span&gt;

  &lt;span class="c"&gt;# common rclone parameters&lt;/span&gt;
  &lt;span class="nv"&gt;rclone_args&lt;/span&gt;&lt;span class="o"&gt;=(&lt;/span&gt;
    &lt;span class="s2"&gt;"-P"&lt;/span&gt;
    &lt;span class="s2"&gt;"--s3-storage-class"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$storage_class&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="s2"&gt;"--s3-upload-concurrency"&lt;/span&gt; 8
    &lt;span class="s2"&gt;"--s3-no-check-bucket"&lt;/span&gt;
  &lt;span class="o"&gt;)&lt;/span&gt;

  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$root_path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;backup_file &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$root_path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;basename&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$root_path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$split_depth&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-eq&lt;/span&gt; 0 &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;backup_path &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$root_path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;basename&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$root_path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;else
    &lt;/span&gt;traverse_path &lt;span class="nb"&gt;.&lt;/span&gt;
  &lt;span class="k"&gt;fi&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

parse_params &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$@&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The script expects at least three parameters: the S3 bucket name (&lt;code&gt;--bucket&lt;/code&gt;), the backup name (&lt;code&gt;--name&lt;/code&gt;), and the local path to be backed up (&lt;code&gt;--path&lt;/code&gt;). The backup name serves as an S3 prefix to separate distinct backups.&lt;/p&gt;
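
&lt;p&gt;For example, assuming you saved the script as &lt;code&gt;backup-to-s3.sh&lt;/code&gt; (the file name and values below are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# back up the whole disk under the "my_disk" prefix in the bucket
./backup-to-s3.sh --bucket my-backup-bucket-name --name my_disk --path /Volumes/my_disk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
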

&lt;p&gt;After parsing the input arguments, the script does four things.&lt;/p&gt;

&lt;p&gt;Firstly, it converts the path to absolute.&lt;/p&gt;

&lt;p&gt;Secondly, it calculates the chunk size for the multipart file upload based on the max archive size. More on this later.&lt;/p&gt;

&lt;p&gt;Thirdly, it creates an array of common parameters for rclone.&lt;/p&gt;

&lt;p&gt;And finally, it executes the backup based on the provided arguments.&lt;/p&gt;
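
&lt;p&gt;As a quick worked example of the chunk size calculation from the second step: S3 multipart uploads are limited to 10,000 parts per object, so for the default max size of 1024 GB the script picks 1024 * 1024 / 10000 + 1 = 105 MB chunks (integer division plus one to round up), which keeps even the largest allowed archive within the parts limit.&lt;/p&gt;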

&lt;h4&gt;
  
  
  Single file backup
&lt;/h4&gt;

&lt;p&gt;If the backup path points to a file, the script uses the &lt;a href="https://rclone.org/commands/rclone_copy/" rel="noopener noreferrer"&gt;rclone copy&lt;/a&gt; command to simply upload the file to the bucket.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Arguments:&lt;/span&gt;
&lt;span class="c"&gt;# - path - absolute path to backup&lt;/span&gt;
&lt;span class="c"&gt;# - name - backup file name&lt;/span&gt;
backup_file&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;
  &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$2&lt;/span&gt;

  msg &lt;span class="s2"&gt;"⬆️ Uploading file &lt;/span&gt;&lt;span class="nv"&gt;$name&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

  &lt;span class="nv"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=(&lt;/span&gt;
    &lt;span class="s2"&gt;"-P"&lt;/span&gt;
    &lt;span class="s2"&gt;"--checksum"&lt;/span&gt;
    &lt;span class="s2"&gt;"--s3-storage-class"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$storage_class&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="s2"&gt;"--s3-upload-concurrency"&lt;/span&gt; 8
    &lt;span class="s2"&gt;"--s3-no-check-bucket"&lt;/span&gt;
  &lt;span class="o"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$dry_run&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; args+&lt;span class="o"&gt;=(&lt;/span&gt;&lt;span class="s2"&gt;"--dry-run"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

  rclone copy &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[@]&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"backup:&lt;/span&gt;&lt;span class="nv"&gt;$bucket&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$backup_name&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;rclone will calculate an MD5 checksum of the file and upload it only if a file with the same name and checksum does not yet exist. This prevents wasting time uploading the file if it's unchanged since the last time you backed it up.&lt;/p&gt;
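
&lt;p&gt;You can see this behavior without transferring anything by re-running the same backup with the script's &lt;code&gt;--dry-run&lt;/code&gt; flag, for example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# nothing is uploaded; rclone only reports what it would do
./backup-to-s3.sh --bucket my-backup-bucket-name --name my_disk --path /Volumes/my_disk/notes.txt --dry-run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
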

&lt;h4&gt;
  
  
  Directory backup
&lt;/h4&gt;

&lt;p&gt;If the path points to a directory, things get more complex.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Arguments:&lt;/span&gt;
&lt;span class="c"&gt;# - path - absolute path to backup&lt;/span&gt;
&lt;span class="c"&gt;# - name - backup name, without an extension, optionally being an S3 path&lt;/span&gt;
&lt;span class="c"&gt;# - files_only - whether to backup only dir-level files, or directory as a whole&lt;/span&gt;
backup_path&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="o"&gt;(&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$2&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;files_only&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;3&lt;/span&gt;&lt;span class="p"&gt;-false&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;

    &lt;span class="nb"&gt;local &lt;/span&gt;archive_name files &lt;span class="nb"&gt;hash &lt;/span&gt;s3_hash

    &lt;span class="nv"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s1"&gt;'s#(/(\./)+)|(/\.$)#/#g'&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="s1"&gt;'s|/$||'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;     &lt;span class="c"&gt;# remove /./ and trailing /&lt;/span&gt;
    &lt;span class="nv"&gt;archive_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$backup_name&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$name&lt;/span&gt;&lt;span class="s2"&gt;.tar.gz"&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s1"&gt;'s|/(\./)+|/|g'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="c"&gt;# remove /./&lt;/span&gt;

    &lt;span class="nb"&gt;cd&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; die &lt;span class="s2"&gt;"Can't access &lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$files_only&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
      &lt;/span&gt;msg &lt;span class="s2"&gt;"🔍 Listing files in &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;..."&lt;/span&gt;
      &lt;span class="nv"&gt;files&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;find &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;-type&lt;/span&gt; f &lt;span class="nt"&gt;-maxdepth&lt;/span&gt; 1 | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="s1"&gt;'s/^\.\///g'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else
      &lt;/span&gt;msg &lt;span class="s2"&gt;"🔍 Listing all files under &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;..."&lt;/span&gt;
      &lt;span class="nv"&gt;files&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;find &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;-type&lt;/span&gt; f | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="s1"&gt;'s/^\.\///g'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;fi&lt;/span&gt;

    &lt;span class="c"&gt;# sort to maintain always the same order for hash&lt;/span&gt;
    &lt;span class="nv"&gt;files&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$files&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nv"&gt;LC_ALL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;C &lt;span class="nb"&gt;sort&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$files&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
      &lt;/span&gt;msg &lt;span class="s2"&gt;"🟫 No files found"&lt;/span&gt;
      &lt;span class="k"&gt;return
    fi

    &lt;/span&gt;&lt;span class="nv"&gt;files_count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$files&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;wc&lt;/span&gt; &lt;span class="nt"&gt;-l&lt;/span&gt; | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{ print $1 }'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
    msg &lt;span class="s2"&gt;"ℹ️ Found &lt;/span&gt;&lt;span class="nv"&gt;$files_count&lt;/span&gt;&lt;span class="s2"&gt; files"&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$files_only&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
      &lt;/span&gt;msg &lt;span class="s2"&gt;"#️⃣ Calculating hash for files in path &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;..."&lt;/span&gt;
    &lt;span class="k"&gt;else
      &lt;/span&gt;msg &lt;span class="s2"&gt;"#️⃣ Calculating hash for directory &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;..."&lt;/span&gt;
    &lt;span class="k"&gt;fi&lt;/span&gt;

    &lt;span class="c"&gt;# replace newlines with zero byte to distinct between whitespaces in names and next files&lt;/span&gt;
    &lt;span class="c"&gt;# "md5sum --" to signal start of file names in case file name starts with "-"&lt;/span&gt;
    &lt;span class="nb"&gt;hash&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$files&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="s1"&gt;'\n'&lt;/span&gt; &lt;span class="s1"&gt;'\0'&lt;/span&gt; | parallel &lt;span class="nt"&gt;-0&lt;/span&gt; &lt;span class="nt"&gt;-k&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="nb"&gt;md5sum&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; | &lt;span class="nb"&gt;md5sum&lt;/span&gt; | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{ print $1 }'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
    msg &lt;span class="s2"&gt;"ℹ️ Hash is: &lt;/span&gt;&lt;span class="nv"&gt;$hash&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

    &lt;span class="nv"&gt;s3_hash&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;aws s3 &lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="s2"&gt;"s3://&lt;/span&gt;&lt;span class="nv"&gt;$bucket&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$archive_name&lt;/span&gt;&lt;span class="s2"&gt;.md5"&lt;/span&gt; - 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$hash&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$s3_hash&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; aws s3api head-object &lt;span class="nt"&gt;--bucket&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$bucket&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--key&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$archive_name&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &amp;amp;&amp;gt;/dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
      &lt;/span&gt;msg &lt;span class="s2"&gt;"🟨 File &lt;/span&gt;&lt;span class="nv"&gt;$archive_name&lt;/span&gt;&lt;span class="s2"&gt; already exists with the same content hash"&lt;/span&gt;
    &lt;span class="k"&gt;else
      &lt;/span&gt;msg &lt;span class="s2"&gt;"⬆️ Uploading file &lt;/span&gt;&lt;span class="nv"&gt;$archive_name&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$dry_run&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
        &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$files&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="s1"&gt;'\n'&lt;/span&gt; &lt;span class="s1"&gt;'\0'&lt;/span&gt; | xargs &lt;span class="nt"&gt;-0&lt;/span&gt; &lt;span class="nb"&gt;tar&lt;/span&gt; &lt;span class="nt"&gt;-zcf&lt;/span&gt; - &lt;span class="nt"&gt;--&lt;/span&gt; |
          rclone rcat &lt;span class="nt"&gt;-P&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
            &lt;span class="nt"&gt;--s3-storage-class&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$storage_class&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
            &lt;span class="nt"&gt;--s3-chunk-size&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;chunk_size_mb&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;Mi"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
            &lt;span class="nt"&gt;--s3-upload-concurrency&lt;/span&gt; 8 &lt;span class="se"&gt;\&lt;/span&gt;
            &lt;span class="nt"&gt;--s3-no-check-bucket&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
            &lt;span class="s2"&gt;"backup:&lt;/span&gt;&lt;span class="nv"&gt;$bucket&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$archive_name&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
        &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$hash&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | aws s3 &lt;span class="nb"&gt;cp&lt;/span&gt; - &lt;span class="s2"&gt;"s3://&lt;/span&gt;&lt;span class="nv"&gt;$bucket&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$archive_name&lt;/span&gt;&lt;span class="s2"&gt;.md5"&lt;/span&gt;
        &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$files&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | aws s3 &lt;span class="nb"&gt;cp&lt;/span&gt; - &lt;span class="s2"&gt;"s3://&lt;/span&gt;&lt;span class="nv"&gt;$bucket&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$archive_name&lt;/span&gt;&lt;span class="s2"&gt;.txt"&lt;/span&gt;
        msg &lt;span class="s2"&gt;"🟩 File &lt;/span&gt;&lt;span class="nv"&gt;$archive_name&lt;/span&gt;&lt;span class="s2"&gt; uploaded"&lt;/span&gt;
      &lt;span class="k"&gt;fi
    fi&lt;/span&gt;
  &lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The directory upload function starts by cleaning parts like &lt;code&gt;/./&lt;/code&gt; from the path and creating the archive name. The archive will be named just like the directory, with a &lt;code&gt;.tar.gz&lt;/code&gt; extension.&lt;/p&gt;

&lt;p&gt;The subsequent process is best explained with a flowchart:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr1tcpsv4n75muyco4e6b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr1tcpsv4n75muyco4e6b.png" alt="Directory backup process"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After the directory is compressed and uploaded, the script creates two additional text files in the S3 bucket. One contains the calculated MD5 hash of the files, and the other contains the list of archived files.&lt;/p&gt;
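
&lt;p&gt;So for a hypothetical backup named &lt;code&gt;my_disk&lt;/code&gt; containing a &lt;code&gt;documents&lt;/code&gt; directory, the bucket would end up with objects like these:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;my-backup-bucket-name
└── my_disk
   ├── documents.tar.gz      (the compressed directory)
   ├── documents.tar.gz.md5  (MD5 hash of the directory content)
   └── documents.tar.gz.txt  (list of the archived files)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
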

&lt;h4&gt;
  
  
  Subdirectories backup
&lt;/h4&gt;

&lt;p&gt;Since we compress the directory and calculate its hash to avoid re-uploading it unnecessarily, it makes sense to archive individual subdirectories separately. This way, if the content of one of them changes, only that archive must be re-uploaded, not everything.&lt;/p&gt;

&lt;p&gt;At the same time, we should aim for a smaller number of bigger archives instead of creating too many small ones. This makes the backup and restore process more efficient, in both time and cost.&lt;/p&gt;

&lt;p&gt;For those reasons, an optional &lt;code&gt;--split-depth&lt;/code&gt; parameter defines how many levels down the directory tree the script should go to create separate archives.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Arguments:&lt;/span&gt;
&lt;span class="c"&gt;# - path - the path relative to $root_path&lt;/span&gt;
&lt;span class="c"&gt;# - depth - the level from the $root_path&lt;/span&gt;
traverse_path&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;
  &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;2&lt;/span&gt;&lt;span class="p"&gt;-1&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;

  &lt;span class="nb"&gt;cd&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$root_path&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; die &lt;span class="s2"&gt;"Can't access &lt;/span&gt;&lt;span class="nv"&gt;$root_path&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

  backup_path &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$root_path&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="s2"&gt;/_files"&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;

  &lt;span class="c"&gt;# read directories to array, taking into account possible spaces in names, see: https://stackoverflow.com/a/23357277/2512304&lt;/span&gt;
  &lt;span class="nb"&gt;local dirs&lt;/span&gt;&lt;span class="o"&gt;=()&lt;/span&gt;
  &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nv"&gt;IFS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;$'&lt;/span&gt;&lt;span class="se"&gt;\0&lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
    &lt;/span&gt;&lt;span class="nb"&gt;dirs&lt;/span&gt;+&lt;span class="o"&gt;=(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$REPLY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;done&lt;/span&gt; &amp;lt; &amp;lt;&lt;span class="o"&gt;(&lt;/span&gt;find &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;-mindepth&lt;/span&gt; 1 &lt;span class="nt"&gt;-maxdepth&lt;/span&gt; 1 &lt;span class="nt"&gt;-type&lt;/span&gt; d &lt;span class="nt"&gt;-print0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;dirs&lt;/span&gt;&lt;span class="k"&gt;:-}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="c"&gt;# if dirs is not unbound due to no elements&lt;/span&gt;
    &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="nb"&gt;dir &lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;dirs&lt;/span&gt;&lt;span class="p"&gt;[@]&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
      if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$dir&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt;&lt;span class="se"&gt;\$&lt;/span&gt;RECYCLE.BIN &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$dir&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt;.Trash-1000 &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$dir&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt;System&lt;span class="se"&gt;\ &lt;/span&gt;Volume&lt;span class="se"&gt;\ &lt;/span&gt;Information &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
        if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nv"&gt;$depth&lt;/span&gt; &lt;span class="nt"&gt;-eq&lt;/span&gt; &lt;span class="nv"&gt;$split_depth&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
          &lt;/span&gt;backup_path &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$root_path&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$dir&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$dir&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nb"&gt;false
        &lt;/span&gt;&lt;span class="k"&gt;else
          &lt;/span&gt;traverse_path &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$path&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$dir&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="k"&gt;$((&lt;/span&gt;depth &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="k"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;fi
      fi
    done
  fi&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each found directory is archived with the &lt;code&gt;backup_path()&lt;/code&gt; function, the same as before. Additionally, all the files in directories above the &lt;code&gt;--split-depth&lt;/code&gt; level are archived as &lt;code&gt;_files.tar.gz&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;To illustrate this, let's take this file structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;my_disk
├── browsing_history.txt
├── documents
│  ├── cv.doc
│  ├── chemtrails-evidence.pdf
│  ├── work
│  │  ├── report1.doc
│  │  └── report2.doc
│  └── personal
│     └── secret_plans.txt
├── photos
│  ├── 1947-07-02-roswell
│  │  └── evidence1.jpg
│  │  └── evidence2.jpg
│  └── 1969-07-20-moon
│     └── moon-landing-real-001.jpg
│     └── moon-landing-real-002.jpg
└── videos
   ├── area51.avi
   └── dallas-1963.avi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With &lt;code&gt;--split-depth 1&lt;/code&gt;, the disk will be backed up as four archives:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;my_disk
├── _files.tar.gz
├── documents.tar.gz
├── photos.tar.gz
└── videos.tar.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And with &lt;code&gt;--split-depth 2&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;my_disk
├── _files.tar.gz
├── documents
│  ├── _files.tar.gz
│  ├── work.tar.gz
│  └── personal.tar.gz
├── photos
│  ├── 1947-07-02-roswell.tar.gz
│  └── 1969-07-20-moon.tar.gz
└── videos
   └── _files.tar.gz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Example
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;./backup.sh &lt;span class="nt"&gt;-b&lt;/span&gt; backuptos3-backupbucket-xxxxxxxxxxxxx &lt;span class="nt"&gt;-n&lt;/span&gt; radziki &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"/Volumes/RADZIKI"&lt;/span&gt; &lt;span class="nt"&gt;--split-depth&lt;/span&gt; 1
2022-02-18 16:30:35 🔍 Listing files &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"/Volumes/RADZIKI"&lt;/span&gt;...
2022-02-18 16:30:39 🟫 No files found
2022-02-18 16:30:39 🔍 Listing files under &lt;span class="s2"&gt;"/Volumes/RADZIKI/nat"&lt;/span&gt;...
2022-02-18 16:35:13 ℹ️ Found 55238 files
2022-02-18 16:35:13 &lt;span class="c"&gt;#️⃣ Calculating hash for directory "/Volumes/RADZIKI/nat"...&lt;/span&gt;
2022-02-18 18:11:04 ℹ️ Hash is: 989626a276bec7f0e9fb6e7c5f057fb9
2022-02-18 18:11:05 ⬆️ Uploading file radziki/nat.tar.gz
Transferred:       41.684 GiB / 41.684 GiB, 100%, 2.720 MiB/s, ETA 0s
Transferred:            1 / 1, 100%
Elapsed &lt;span class="nb"&gt;time&lt;/span&gt;:   1h57m45.1s
2022-02-18 20:08:58 🟩 File radziki/nat.tar.gz uploaded
2022-02-18 20:08:58 🔍 Listing files under &lt;span class="s2"&gt;"/Volumes/RADZIKI/Photo"&lt;/span&gt;...
2022-02-18 20:12:43 ℹ️ Found 42348 files
2022-02-18 20:12:43 &lt;span class="c"&gt;#️⃣ Calculating hash for path "/Volumes/RADZIKI/Photo"...&lt;/span&gt;
2022-02-18 22:19:42 ℹ️ Hash is: c3e347566fa8e12ffc19f7c2e24a1578
2022-02-18 22:19:42 ⬆️ Uploading file radziki/Photo.tar.gz
Transferred:      177.471 GiB / 177.471 GiB, 100%, 89.568 KiB/s, ETA 0s
Transferred:            1 / 1, 100%
Elapsed &lt;span class="nb"&gt;time&lt;/span&gt;:   8h12m19.6s
2022-02-19 06:32:02 🟩 File radziki/Photo.tar.gz uploaded
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, two directories were found in the path and uploaded.&lt;/p&gt;

&lt;p&gt;The first one was 41 GiB in total and was transferred in 1h57m. The other was 177 GiB and took 8h12m to upload. If you &lt;a href="https://www.omnicalculator.com/other/data-transfer" rel="noopener noreferrer"&gt;calculate&lt;/a&gt; that, it almost perfectly matches my internet upload speed of 50 Mbit/s.&lt;/p&gt;
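
&lt;p&gt;To spell out the math for the first archive: 41.684 GiB is roughly 358,000 Mbit, and 1h57m45s is 7,065 seconds, which works out to about 50.7 Mbit/s of effective throughput.&lt;/p&gt;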

&lt;h2&gt;
  
  
  Questions and answers
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why IAM user?
&lt;/h3&gt;

&lt;p&gt;Generally, using IAM users is not the best practice. Instead, it's far better (and more common) to &lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html#delegate-using-roles" rel="noopener noreferrer"&gt;use SSO and grant access through IAM roles&lt;/a&gt;. Furthermore, it's even safer (and more convenient) to use tools like &lt;a href="https://www.leapp.cloud/" rel="noopener noreferrer"&gt;Leapp&lt;/a&gt; or &lt;a href="https://github.com/99designs/aws-vault" rel="noopener noreferrer"&gt;aws-vault&lt;/a&gt; to manage AWS access.&lt;/p&gt;

&lt;p&gt;However, credentials obtained this way are valid for only up to 12 hours and may require manual actions, like providing an MFA token, to refresh them. Depending on the directory size and your internet connection, uploading a backup may take longer than that.&lt;/p&gt;

&lt;p&gt;For that reason, we use an IAM user with static access credentials set directly in rclone. This user has access limited to only the backup bucket.&lt;/p&gt;
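
&lt;p&gt;For reference, a minimal IAM policy for such a user could look more or less like this (the bucket name here is just a placeholder):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-backup-bucket"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject"],
      "Resource": "arn:aws:s3:::my-backup-bucket/*"
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;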

&lt;p&gt;Putting the user access credentials directly in the rclone configuration has one additional perk. It allows for the backup to be done in the background without affecting your other work with AWS.&lt;/p&gt;
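
&lt;p&gt;On the rclone side, the remote with those static credentials is a plain entry in &lt;code&gt;~/.config/rclone/rclone.conf&lt;/code&gt; – a sketch with placeholder values (the remote name is arbitrary):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[s3backup]
type = s3
provider = AWS
access_key_id = AKIAXXXXXXXXXXXXXXXX
secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
region = eu-west-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;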

&lt;h3&gt;
  
  
  Why .tar.gz archives?
&lt;/h3&gt;

&lt;p&gt;If you just want to upload the whole directory recursively to the S3 bucket, rclone &lt;a href="https://rclone.org/commands/rclone_copy/" rel="noopener noreferrer"&gt;copy&lt;/a&gt; or &lt;a href="https://rclone.org/commands/rclone_sync/" rel="noopener noreferrer"&gt;sync&lt;/a&gt; commands will handle it. So why bother with compressing files into a &lt;code&gt;.tar.gz&lt;/code&gt; archive?&lt;/p&gt;

&lt;p&gt;Uploading or downloading a single archive file will be much faster than doing the same with hundreds or thousands of individual files. Since this is a backup, not regular storage, there is no requirement to quickly fetch a single file.&lt;/p&gt;
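
&lt;p&gt;For comparison, the no-archive variant would be a single &lt;code&gt;rclone copy&lt;/code&gt; call, transferring every file as a separate S3 object (remote and bucket names are examples):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;rclone copy "/Volumes/MY_DISK/photos" "s3backup:my-backup-bucket/photos"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;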

&lt;p&gt;Working on a smaller number of big files also reduces the costs of S3 PUT and GET operations.&lt;/p&gt;

&lt;p&gt;This may be simplified in the future &lt;a href="https://github.com/rclone/rclone/issues/2815" rel="noopener noreferrer"&gt;when rclone gets a built-in archive capability&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why stream archive?
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;.tar.gz&lt;/code&gt; archive is streamed with the pipe (&lt;code&gt;|&lt;/code&gt;) operator to the rclone &lt;a href="https://rclone.org/commands/rclone_rcat/" rel="noopener noreferrer"&gt;rcat&lt;/a&gt; command, which reads the data from the standard input and sends it to the storage. The archive file is never created on the disk, so you don't need free space in memory or on disk equal to the backup size.&lt;/p&gt;
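
&lt;p&gt;Stripped of logging and error handling, the core of the upload is a pipe along these lines (a sketch – remote and bucket names are examples, and the chunk size flag is explained below):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# gzip the tar stream to stdout and let rclone read it from stdin
tar -czf - "documents" | rclone rcat --s3-chunk-size "${chunk_size_mb}M" "s3backup:my-backup-bucket/documents.tar.gz"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;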

&lt;p&gt;This, however, brings some consequences.&lt;/p&gt;

&lt;p&gt;One is that rclone does not know the total size of the archive upfront.&lt;/p&gt;

&lt;p&gt;Sending a big file to S3 is done with a &lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html" rel="noopener noreferrer"&gt;multipart upload&lt;/a&gt;. First, the upload process is started, then the consecutive chunks of the file are sent separately, and, finally, the transfer is completed. There is a limit, though, of 10,000 parts max. Thus, each chunk must be big enough. Since rclone does not know the total file size, we must set the chunk size manually.&lt;/p&gt;

&lt;p&gt;This is where the previously calculated &lt;code&gt;$chunk_size_mb&lt;/code&gt; comes into play. By default, it's set so that the total file size limit is 1 TiB. You can use the &lt;code&gt;--max-size&lt;/code&gt; parameter to modify it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;chunk_size_mb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt;max_size_gb &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="m"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="m"&gt;10000&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="k"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
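
&lt;p&gt;For example, with the default 1 TiB limit (a &lt;code&gt;max_size_gb&lt;/code&gt; of 1024), integer division gives 1024 × 1024 / 10,000 + 1 = 105 MiB per chunk, and 10,000 such parts cover just over 1 TiB.&lt;/p&gt;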



&lt;p&gt;The other consequence of streaming the archive file is that rclone can't calculate its checksum before uploading.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why calculate the hash?
&lt;/h3&gt;

&lt;p&gt;Typically, rclone calculates the file checksum and compares it with the checksum of the file already existing in the storage. For example, this happens when we back up a single file using &lt;code&gt;rclone copy&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;But since we are streaming the archive, rclone can't calculate the checksum on its own, so we need to do it ourselves.&lt;/p&gt;

&lt;p&gt;We never create the whole archive file locally, so we can't calculate its hash. Instead, we calculate the MD5 of all the files and then compute the MD5 of those hashes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;hash&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$files&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="s1"&gt;'\n'&lt;/span&gt; &lt;span class="s1"&gt;'\0'&lt;/span&gt; | parallel &lt;span class="nt"&gt;-0&lt;/span&gt; &lt;span class="nt"&gt;-k&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="nb"&gt;md5sum&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; | &lt;span class="nb"&gt;md5sum&lt;/span&gt; | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{ print $1 }'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This hash is uploaded as a separate object next to the archive. On the next script run, the hash for the local files is re-calculated and compared with the one in the storage.&lt;/p&gt;
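
&lt;p&gt;The comparison itself can then be as simple as this sketch (the &lt;code&gt;.md5&lt;/code&gt; object name and remote are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# fetch the previously stored hash, if any
remote_hash=$(rclone cat "s3backup:my-backup-bucket/documents.tar.gz.md5" 2&amp;gt;/dev/null || true)
if [[ "$hash" == "$remote_hash" ]]; then
  echo "Files unchanged since the last run - skipping upload"
fi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;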

&lt;p&gt;Calculating MD5 hashes for thousands of files can be time-consuming. However, the biggest bottleneck of the backup process is the upload speed. Therefore, calculating the hash and possibly skipping the archive upload can reduce the total process time, especially for rarely modified files.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why rclone?
&lt;/h3&gt;

&lt;p&gt;rclone provides a layer on top of the AWS SDK for interacting with S3 buckets. While the same could be achieved with the AWS CLI, rclone displays upload progress when sending data from the input stream, and the AWS CLI does not.&lt;/p&gt;

&lt;p&gt;However, &lt;strong&gt;the progress shown by rclone is not 100% accurate&lt;/strong&gt;. &lt;a href="https://forum.rclone.org/t/multipart-upload-stops-for-few-minutes-after-4th-chunk/29122/4?u=madzikowski" rel="noopener noreferrer"&gt;rclone shows data as transferred once buffered by the AWS SDK&lt;/a&gt;, not when actually sent. For that reason, the displayed progress may speed up, slow down, or even stall from time to time. In reality, data is constantly uploaded.&lt;/p&gt;

&lt;p&gt;Another reason to use rclone is that, because access credentials for rclone are set separately in its config file, ongoing backup upload does not interfere with other activities and tools that use local AWS credentials. So we can work with other AWS accounts at the same time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why S3 Glacier Flexible Retrieval?
&lt;/h3&gt;

&lt;p&gt;AWS S3 has different storage classes, with different pricing per GB stored and per operation. &lt;strong&gt;The S3 Glacier Flexible Retrieval class is over 6 times cheaper per GB stored than the default S3 Standard.&lt;/strong&gt; In the least expensive AWS regions (like &lt;code&gt;us-east-1&lt;/code&gt; or &lt;code&gt;eu-west-1&lt;/code&gt;), the storage price is &lt;strong&gt;only $3.69 per TiB per month&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;On the other hand, fetching objects from S3 Glacier Flexible Retrieval is not instantaneous. First, you have to request an object restore. It's free if you are okay with waiting 5-12 hours for it to be ready. Otherwise, it can be sped up by choosing Expedited retrieval, but for $30 per TiB.&lt;/p&gt;

&lt;p&gt;An even cheaper storage class is S3 Glacier Deep Archive. Using it can further reduce storage costs to only $1.01 per TiB. But, contrary to Flexible Retrieval, it does not provide an option to retrieve the data in less than a few hours. Also, there is no free data retrieval option, although it costs only $2.56 per TiB if you are willing to wait up to 48 hours.&lt;/p&gt;

&lt;p&gt;With personal backups in mind, data retrieval should be rare and, hopefully, not time-critical. Thus the S3 Glacier Flexible Retrieval storage class provides a reasonable balance between costs and data access options.&lt;/p&gt;

&lt;p&gt;Also, keep in mind that with S3 Glacier Flexible Retrieval, you are billed for at least 90 days of storing your objects, even if you remove them sooner.&lt;/p&gt;

&lt;p&gt;The storage class can be set with the &lt;code&gt;--storage-class&lt;/code&gt; parameter.&lt;/p&gt;
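
&lt;p&gt;For example, to use Deep Archive instead (assuming the value is passed straight through to S3, so it takes the standard S3 storage class names):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./backup.sh -b backuptos3-backupbucket-xxxxxxxxxxxxx -n radziki -p "/Volumes/RADZIKI" --storage-class DEEP_ARCHIVE
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;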

&lt;p&gt;Please see the &lt;a href="https://aws.amazon.com/s3/pricing/" rel="noopener noreferrer"&gt;S3 pricing page&lt;/a&gt; for all the details.&lt;/p&gt;

&lt;h2&gt;
  
  
  Problems and considerations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Automatic backups
&lt;/h3&gt;

&lt;p&gt;Backups are best if they are done regularly. Otherwise, you may find yourself looking for a backup to restore, only to discover the last one was made a year ago.&lt;/p&gt;

&lt;p&gt;If you want to back up local files from your computer, nothing stops you from automating it with a cron job.&lt;/p&gt;
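
&lt;p&gt;For example, a crontab entry like this one would run the backup every Sunday at 3 AM (paths and arguments are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# edit with: crontab -e
0 3 * * 0 /home/user/backup.sh -b my-backup-bucket -n laptop -p "/home/user/documents" &amp;gt;&amp;gt; /var/log/backup.log 2&amp;gt;&amp;amp;1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;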

&lt;p&gt;Unfortunately, such automation is hardly possible with external drives that are attached only from time to time. There, it all relies on your good practices.&lt;/p&gt;

&lt;h3&gt;
  
  
  Backup from Google Photos (and similar)
&lt;/h3&gt;

&lt;p&gt;Apart from physical drives and local files, backing up photos and documents from services like Google Photos, Google Drive, etc., is not a bad idea either. Accidents happen, even with cloud services.&lt;/p&gt;

&lt;p&gt;As a Google Photos user, I hoped to &lt;a href="https://rclone.org/googlephotos/" rel="noopener noreferrer"&gt;fetch photos with rclone&lt;/a&gt;. Unfortunately, it's not possible to download photos in their original resolution this way, which makes it unfit for backups.&lt;/p&gt;

&lt;p&gt;The only possibility I found is to use &lt;a href="https://takeout.google.com/settings/takeout" rel="noopener noreferrer"&gt;Google Takeout&lt;/a&gt; to get a data dump from Photos, Drive, Gmail, and other Google services, and then upload it to S3 with the backup script.&lt;/p&gt;

&lt;h3&gt;
  
  
  Disk operations vs power-saving mode
&lt;/h3&gt;

&lt;p&gt;Starting a backup and leaving the laptop for the night may not necessarily bring the best results. I did that and was surprised to see that calculating the file checksums had not completed after 8 hours. But when I re-ran the script, it finished in an hour or so.&lt;/p&gt;

&lt;p&gt;Even when not running on battery, computers tend to throttle background operations when idle.&lt;/p&gt;

&lt;p&gt;On macOS, in Settings -&amp;gt; Battery -&amp;gt; Power Adapter, you can uncheck "Put hard disks to sleep when possible". A more aggressive option is to &lt;a href="https://gist.github.com/pwnsdx/2ae98341e7e5e64d32b734b871614915" rel="noopener noreferrer"&gt;disable sleep&lt;/a&gt; altogether.&lt;/p&gt;
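
&lt;p&gt;Alternatively, you can keep the machine awake just for the duration of the backup by wrapping the script in &lt;code&gt;caffeinate&lt;/code&gt;, which prevents idle and disk sleep while the command runs (arguments are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# -i prevents idle sleep, -m prevents disk sleep, for as long as backup.sh runs
caffeinate -im ./backup.sh -b my-backup-bucket -n my-disk -p "/Volumes/MY_DISK"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;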

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;A 250-line Bash script is not a foolproof backup system. The backup is not incremental, and it does not handle file modifications during the archiving process.&lt;/p&gt;

&lt;p&gt;Nonetheless, this is just what I need to back up my external drive from time to time and sleep without worrying about losing data if the drive does not work the next time I attach it to the computer.&lt;/p&gt;

&lt;p&gt;The whole script with usage instructions is available on GitHub: &lt;a href="https://github.com/m-radzikowski/aws-s3-personal-backup" rel="noopener noreferrer"&gt;aws-s3-personal-backup&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, as I said at the beginning, I'm aware that I may have reinvented the wheel. If so, please share what backup systems work for you.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>s3</category>
      <category>backup</category>
    </item>
    <item>
      <title>Headless CMS with Gatsby on AWS for $0.00 per month</title>
      <dc:creator>Maciej Radzikowski</dc:creator>
      <pubDate>Tue, 19 Oct 2021 15:19:02 +0000</pubDate>
      <link>https://forem.com/aws-builders/headless-cms-with-gatsby-on-aws-for-000-per-month-420o</link>
      <guid>https://forem.com/aws-builders/headless-cms-with-gatsby-on-aws-for-000-per-month-420o</guid>
      <description>&lt;p&gt;Can you have a &lt;strong&gt;website with a CMS on AWS&lt;/strong&gt; and not pay just for its existence? I looked at Amazon Lightsail, headless WordPress, and Webiny CMS but &lt;strong&gt;found none of those suitable&lt;/strong&gt;. So I chose Prismic – a SaaS headless CMS – and Gatsby to create the site. Yes, I needed to make a pipeline to build my website after content changes. But when I did it, &lt;strong&gt;I got a website with a CMS hosted at no cost&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  In Need of a Website
&lt;/h2&gt;

&lt;p&gt;I build serverless applications on AWS daily. But when I needed to create a simple website for my new project, &lt;a href="https://siteclue.app"&gt;SiteClue&lt;/a&gt;, I surprisingly had no solution ready off the top of my head.&lt;/p&gt;

&lt;p&gt;My requirements were simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;landing page with a &lt;strong&gt;custom HTML&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;ability to &lt;strong&gt;edit content with CMS&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;hosted on AWS&lt;/strong&gt; – to have everything in one place&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;low to no fixed costs&lt;/strong&gt; – paying for hosting a website that (for now) has almost no visits is simply against my rules&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Regular, Headless or Serverless CMS on AWS
&lt;/h2&gt;

&lt;p&gt;There are multiple ways to host a website with CMS on AWS. So why not…?&lt;/p&gt;

&lt;h3&gt;
  
  
  Why not Amazon Lightsail with WordPress
&lt;/h3&gt;

&lt;p&gt;If you look for &lt;strong&gt;hosting a WordPress site on AWS&lt;/strong&gt;, the default recommended service is &lt;a href="https://aws.amazon.com/lightsail/"&gt;Lightsail&lt;/a&gt;. It’s an out-of-the-box solution for simple web applications. While being much less complex than EC2, you still get access to a virtual machine to install your app.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3uDTsYTH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/j428hthiewmorqcz1rz0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3uDTsYTH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/j428hthiewmorqcz1rz0.png" alt="Apache with WordPress and MySQL can be hosted on a single Amazon Lightsail instance."&gt;&lt;/a&gt;&lt;br&gt;Hosting WordPress on Lightsail can be as simple as that (&lt;a href="https://aws.amazon.com/blogs/compute/deploying-a-highly-available-wordpress-site-on-amazon-lightsail-part-1-implementing-a-highly-available-lightsail-database-with-wordpress/"&gt;source&lt;/a&gt;)
  &lt;/p&gt;

&lt;p&gt;The problem is, Lightsail pricing starts at $3.50 per month. That’s about twice what I pay for hosting this blog. And I don’t feel like paying three bucks for hosting a website that I edit once or twice per month, just for the fact that it exists. Serverless has spoiled me.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why not Headless WordPress on AWS
&lt;/h3&gt;

&lt;p&gt;My second thought: hey, there was an article about &lt;a href="https://dev.to/aws-builders/serverless-static-wordpress-on-aws-for-0-01-a-day-1b29"&gt;Serverless Static WordPress on AWS for $0.01 a day&lt;/a&gt; not long ago. A penny a day – I can accept that. Let’s dig into this.&lt;/p&gt;

&lt;p&gt;It turns out that this solution works like that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You go to AWS CodeBuild and &lt;strong&gt;run the job&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The job starts an ECS container&lt;/strong&gt; with WordPress&lt;/li&gt;
&lt;li&gt;You log in to WordPress to &lt;strong&gt;make changes&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;You click “Generate Static Site” in WordPress to &lt;strong&gt;dump the website to the S3 bucket&lt;/strong&gt; and host it from there&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You run the job to stop the ECS container&lt;/strong&gt; (or pay for the running container that you don’t use and forgot to stop)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--eGBFHlGl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ms06l2r8nr3bwbchitgj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--eGBFHlGl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ms06l2r8nr3bwbchitgj.png" alt="Serverless Static WordPress has an ECS container and RDS Aurora Serverless for editing content, and S3 with CloudFront and Lambda@Edge for service the website."&gt;&lt;/a&gt;&lt;br&gt;Serverless Static WordPress architecture (&lt;a href="https://www.techtospeech.com/serverless-static-wordpress-on-aws-for-0-01-a-day/https://www.techtospeech.com/serverless-static-wordpress-on-aws-for-0-01-a-day/"&gt;source&lt;/a&gt;)
  &lt;/p&gt;

&lt;p&gt;Well… That’s an interesting solution, but not appropriate for my needs.&lt;/p&gt;

&lt;p&gt;With it, you are launching the CMS only when you need it. I see several drawbacks here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You need to go to AWS to launch it and stop it.&lt;/strong&gt; Not optimal if you want to give access to CMS for someone not technical to write a new blog post.&lt;/li&gt;
&lt;li&gt;Even if you automate the above with some UI button, &lt;strong&gt;you have to wait a minute or two for the container to start.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;And lastly, you need to &lt;strong&gt;remember to stop the instance&lt;/strong&gt; to not pay for it when you no longer use it. However, you could automate this with some inactivity timeout.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In short, the whole solution is a bit too complicated for hosting a simple website, and the user experience is not the best.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why not Webiny CMS
&lt;/h3&gt;

&lt;p&gt;At that point, I resigned from using WordPress and focused my search on &lt;strong&gt;“serverless CMS”&lt;/strong&gt;. Pretty soon, I came across &lt;a href="https://www.webiny.com/serverless-cms"&gt;Webiny Serverless CMS&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Webiny is an open-source project that you deploy to AWS. It takes advantage of serverless services to bring you a self-hosted, customizable, and extensible CMS. It creates the infrastructure containing Lambda functions, API Gateway, DynamoDB, S3, and… Elasticsearch.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Negaaock--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/9oddvcptg599oyjowtdg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Negaaock--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/9oddvcptg599oyjowtdg.png" alt="Webiny CMS is a SPA frontend application with a bunch of Lambda functions, DynamoDB, Elasticsearch and Cognito on the backend."&gt;&lt;/a&gt;&lt;br&gt;Webiny API architecture (&lt;a href="https://www.webiny.com/docs/key-topics/cloud-infrastructure/api/overview/"&gt;source&lt;/a&gt;)
  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Almost everything deployed is serverless&lt;/strong&gt;, where you pay only for what you really use. &lt;strong&gt;Except for the Elasticsearch,&lt;/strong&gt; &lt;a href="https://aws.amazon.com/blogs/aws/amazon-elasticsearch-service-is-now-amazon-opensearch-service-and-supports-opensearch-10/"&gt;recently renamed&lt;/a&gt; to Amazon OpenSearch Service.&lt;/p&gt;

&lt;p&gt;While you can have a small Elasticsearch/OpenSearch instance free of charge in the free tier for the first year, afterward, you have to pay &lt;strong&gt;at least $13 per month&lt;/strong&gt;. I rejected Lightsail for $3.50 per month, so I won’t pay 4x as much for OpenSearch.&lt;/p&gt;

&lt;p&gt;Nonetheless, I like the idea of Webiny and see a great use for it in the future. The good news is that &lt;a href="https://github.com/webiny/webiny-js/discussions/1899#discussioncomment-1282715"&gt;they work on the ability to make Elasticsearch optional&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Headless CMS and Static Website
&lt;/h2&gt;

&lt;p&gt;Eventually, I came up with a solution. So why…?&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Prismic CMS
&lt;/h3&gt;

&lt;p&gt;Finally, I looked for &lt;strong&gt;“headless CMS”&lt;/strong&gt;. I quickly found out &lt;a href="https://jamstack.org/headless-cms/"&gt;there is a ton of them&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The general idea of a headless CMS is that it provides an editor to create content and an API that exposes that content. Then &lt;strong&gt;the website fetches content from the API&lt;/strong&gt;. This is, by the way, the same workflow as with the Webiny described above.&lt;/p&gt;

&lt;p&gt;Building a website with a headless CMS can be more work than with a standard CMS like WordPress. But at the same time, it brings some benefits, which I will cover at the end.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--79J3aGeM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wrx2c5o58c0d1mlzn27m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--79J3aGeM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wrx2c5o58c0d1mlzn27m.png" alt="Prismic CMS provides a WYSIWYG editor and ability to define own custom blocks to build rich websites."&gt;&lt;/a&gt;&lt;br&gt;Prismic CMS UI (&lt;a href="https://prismic.io/non-developer"&gt;source&lt;/a&gt;)
  &lt;/p&gt;

&lt;p&gt;Some of the headless CMS services are open-source, and you host them by yourself. This is nice, but I don’t want to have an EC2 or even a Docker container running all the time.&lt;/p&gt;

&lt;p&gt;Others are SaaS products, where you edit content and have access to the API. I reviewed a few of them and ended up choosing &lt;a href="https://prismic.io/"&gt;Prismic&lt;/a&gt;. Why? I had heard of it before, the docs looked good, and it has a free plan for one user as well as sensible pricing when the editorial team grows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Gatsby for Website
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.gatsbyjs.com/"&gt;Gatsby&lt;/a&gt; is a static site generator&lt;/strong&gt;, one of the most popular ones. I think the peak of its popularity, when everyone was talking about it, was actually some time ago, but most days, I look at the frontend world from the distance.&lt;/p&gt;

&lt;p&gt;Now, needing to build a website that will consume and display the content from an API, I decided to catch up and see for myself what’s the deal with Gatsby. &lt;strong&gt;It’s based on React and has a lot of plugins&lt;/strong&gt;, which really speeds up development. Importantly, like every other major headless CMS, &lt;strong&gt;Prismic provides a plugin for Gatsby&lt;/strong&gt;. That makes the integration effortless.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--PFgZHg_v--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7to0nu502kqiolpue2k7.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--PFgZHg_v--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7to0nu502kqiolpue2k7.jpg" alt="Great Gatsby, the movie, not the statis site generator."&gt;&lt;/a&gt;&lt;br&gt;Oops, sorry, wrong Gatsby
  &lt;/p&gt;

&lt;p&gt;Gatsby produces a static website, meaning all the content is generated/fetched during the build and converted into &lt;strong&gt;a simple HTML page you can host from an S3 bucket&lt;/strong&gt;. This gives you an ultra-fast website that does not rely on any backend making a bunch of database queries on each page view. However, on every content change, you need to re-build the website.&lt;/p&gt;

&lt;h2&gt;
  
  
  CI Pipeline for Headless CMS and Gatsby
&lt;/h2&gt;

&lt;p&gt;With the tech stack selected, now it’s time to make it work.&lt;/p&gt;

&lt;p&gt;As mentioned above, &lt;strong&gt;you need to build the Gatsby website to update it&lt;/strong&gt;. This is necessary for two situations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you update the website code itself (e.g., website structure),&lt;/li&gt;
&lt;li&gt;or you update the content in the CMS.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because of this first case, &lt;strong&gt;I decided to build the website using GitHub Actions&lt;/strong&gt;, since this is my preferred CI system. Every time I push changes to the repository &lt;code&gt;main&lt;/code&gt; branch, it triggers the build and deployment of the website.&lt;/p&gt;

&lt;p&gt;Now, &lt;strong&gt;I needed the build on GitHub to also happen after changes in Prismic&lt;/strong&gt;. Thankfully, in Prismic, you can add a webhook that triggers after content edits.&lt;/p&gt;

&lt;p&gt;In the perfect world, we could point the Prismic webhook directly to the GitHub API to trigger the build job. But the GitHub &lt;a href="https://docs.github.com/en/rest/reference/repos#create-a-repository-dispatch-event"&gt;&lt;code&gt;/dispatches&lt;/code&gt; endpoint&lt;/a&gt; requires an &lt;code&gt;event_type&lt;/code&gt; parameter in the payload body. Unfortunately, we can’t add it to the messages sent from Prismic. However, as you probably know, &lt;strong&gt;there is nothing we couldn’t patch with a Lambda function&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;APIGatewayProxyHandlerV2&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aws-lambda/trigger/api-gateway-proxy&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;axios&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;axios&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prismicSecret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PRISMIC_SECRET&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;githubUser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;GITHUB_USER&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;githubRepo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;GITHUB_REPO&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;githubToken&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;GITHUB_TOKEN&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;APIGatewayProxyHandlerV2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;{}&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;secret&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nx"&gt;prismicSecret&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;statusCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;403&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;axios&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`https://api.github.com/repos/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;githubUser&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;githubRepo&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/dispatches`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;event_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;prismatic_update&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;githubUser&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;githubToken&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GitHub response&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;When invoked from the Prismic webhook, this Lambda verifies the request by checking the secret token and calls the GitHub API to start the build.&lt;/p&gt;

&lt;p&gt;The whole architecture looks as follows:&lt;/p&gt;


  &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--K_p6vOxw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ubj15hudg6ut4h3thhg4.png" alt="Prismic webhook triggers Lambda function that starts a GitHub Actions execution. It builds and uploads the Gatsby static website to AWS S3 bucket, from where it's hosted by CloudFront."&gt;Pipeline for updating Gatsby website after changes in Headless CMS
  


&lt;p&gt;In practice, the build and deployment in the GitHub Actions are handled by Serverless Framework, which deploys AWS services along with the website.&lt;/p&gt;
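
&lt;p&gt;For completeness, the workflow just needs to listen to both triggers – a minimal sketch of the relevant part of the GitHub Actions YAML, assuming the &lt;code&gt;event_type&lt;/code&gt; sent by the Lambda above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;on:
  push:
    branches: [main]
  repository_dispatch:
    types: [prismatic_update]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;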

&lt;p&gt;To make it all work, you need to configure a few things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;in &lt;strong&gt;Prismic&lt;/strong&gt;, you need to &lt;strong&gt;create a webhook&lt;/strong&gt; providing the HTTP API URL as a target and some secret value for authorization,&lt;/li&gt;
&lt;li&gt;in &lt;strong&gt;Build Trigger Lambda&lt;/strong&gt;, you need to set:

&lt;ul&gt;
&lt;li&gt;the same secret value to validate requests,&lt;/li&gt;
&lt;li&gt;GitHub username and repository name,&lt;/li&gt;
&lt;li&gt;generated GitHub Personal Access Token to be able to dispatch builds,&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;and finally in &lt;strong&gt;GitHub&lt;/strong&gt;, you need to create repository secrets used in the build:

&lt;ul&gt;
&lt;li&gt;AWS access key ID and secret,&lt;/li&gt;
&lt;li&gt;Prismic repo name and generated API token.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And that’s it. From this moment, every published change in Prismic will trigger the build. The website will be updated in about 3-4 minutes.&lt;/p&gt;

&lt;p&gt;You could optimize it by only building and uploading the website to the S3 bucket from the webhook, instead of deploying the whole stack every time. Let’s make this your homework, shall we?&lt;/p&gt;

&lt;p&gt;You can find the link to the repository with the entire solution at the end of the article 👇&lt;/p&gt;
&lt;h2&gt;
  
  
  Real Cost of Static Website with CMS
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Is it really $0.00 per month?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes. But also no.&lt;/p&gt;

&lt;p&gt;There are no fixed or minimal costs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prismic&lt;/strong&gt; is free for one user,&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Actions&lt;/strong&gt; provide free 2000 execution minutes per month (each build takes about 4 minutes),&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP API requests&lt;/strong&gt; cost $0.000001 per invocation, so to get billed $0.01, you would need to make at least 10,000 updates,&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lambda&lt;/strong&gt; is free up to 1M invocations and 400,000 GB-seconds of compute time per month (a single execution takes milliseconds),&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;S3&lt;/strong&gt; with even a 50 MB website and 100 file uploads also costs below $0.01 per month.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thus you can update the website multiple times per month, and as long as no one visits it, you don’t pay a cent.&lt;/p&gt;

&lt;p&gt;On the other hand, you may eventually begin paying once people start visiting the website. &lt;strong&gt;The only significant cost comes from CloudFront&lt;/strong&gt;, where you pay for the number of requests and the total data transfer size. Pricing will depend on the CloudFront settings and the location of your visitors.&lt;/p&gt;

&lt;p&gt;I think that’s fair – you pay only for the actual usage, as with other serverless services. But, if that cost becomes a problem, you can use another CDN in its place. &lt;a href="https://www.cloudflare.com/"&gt;Cloudflare&lt;/a&gt; may be a much cheaper option here (even free).&lt;/p&gt;
&lt;h2&gt;
  
  
  Is using SaaS CMS cheating?
&lt;/h2&gt;

&lt;p&gt;Maybe.&lt;/p&gt;

&lt;p&gt;Yes, I said I wanted to have everything hosted on AWS. But by that, I meant that I don’t have to deploy anything elsewhere myself. Using an external SaaS solution is much less of a problem for me.&lt;/p&gt;

&lt;p&gt;And yes, I’m eager to try Webiny as soon as they provide a built-in deployment without the Elasticsearch option. I will write a follow-up of this post then – subscribe to not miss it 😉&lt;/p&gt;
&lt;h2&gt;
  
  
  Pros and Cons of Headless CMS and Gatsby
&lt;/h2&gt;

&lt;p&gt;There are several benefits of this solution:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;It’s easy to make changes.&lt;/strong&gt; Any non-technical person can modify the content. Even a technical one wouldn’t like to change the HTML and manually update the website every time a few words need to be added to the page.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It’s so cheap it’s basically free.&lt;/strong&gt; You pay nothing for the existence of the website and making changes several times a month. Even with a lot of changes, it would be hard to spend more than a few cents. You pay for the visits, like with any page you expose through CloudFront.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The website loads fast.&lt;/strong&gt; That’s the beauty and main idea of static site generators like Gatsby. No backend and database queries. Everything is already in HTML, served from the nearest CDN edge location.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, there are also some drawbacks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Changes are not visible immediately.&lt;/strong&gt; After every modification, the page needs to be built and deployed, which takes a moment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The CMS is not as well-known and flexible as WordPress.&lt;/strong&gt; You can be almost sure that any writer will be familiar with WordPress. Prismic, on the other hand, is not (yet?) as well known, although it provides all the essential features needed in most cases.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I will leave the rest of the “WordPress vs. anything else” discussion out of this list. There is no universal answer to it, and everything depends on the use case.&lt;/p&gt;
&lt;h2&gt;
  
  
  Final notes
&lt;/h2&gt;

&lt;p&gt;As always, you can find a repository with the complete example on GitHub:&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--i3JOwpme--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev.to/assets/github-logo-ba8488d21cd8ee1fee097b8410db9deaa41d0ca30b004c0c63de0a479114156f.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/m-radzikowski"&gt;
        m-radzikowski
      &lt;/a&gt; / &lt;a href="https://github.com/m-radzikowski/aws-website-cms"&gt;
        aws-website-cms
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Gatsby website with Prismic CMS automatically updated on AWS.
    &lt;/h3&gt;
  &lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;If you are interested in &lt;strong&gt;website analytics that cares about privacy and where you pay based on the actual usage&lt;/strong&gt; (just like in the solution this whole article was about), please check out SiteClue. This post would not exist if I hadn’t had to make this website:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://siteclue.app"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--PVTDRt9j--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lgmyuiw09iathbpv42bb.png" alt="SiteClue: Intuitive web analytics with privacy at heart"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>cms</category>
      <category>gatsby</category>
    </item>
    <item>
      <title>My AWS toolbox – tools, plugins and applications</title>
      <dc:creator>Maciej Radzikowski</dc:creator>
      <pubDate>Fri, 23 Oct 2020 15:58:19 +0000</pubDate>
      <link>https://forem.com/mradzikowski/my-aws-toolbox-tools-plugins-and-applications-1p79</link>
      <guid>https://forem.com/mradzikowski/my-aws-toolbox-tools-plugins-and-applications-1p79</guid>
      <description>&lt;p&gt;&lt;em&gt;This post was originally published at &lt;a href="https://betterdev.blog/my-aws-toolbox/" rel="noopener noreferrer"&gt;https://betterdev.blog/my-aws-toolbox/&lt;/a&gt; - check it out for more related content.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Developers, like all specialists, discover and collect their favorite tools over time. Having a good, proven set of tools makes the work easier and more pleasant. We can focus on getting the job done. Sometimes eliminating minor inconveniences or improving a small element of everyday activity makes the greatest impact on the comfort of work.&lt;/p&gt;

&lt;p&gt;It’s not always easy to find the best tools. There is a wide choice. More importantly, everyone has different habits and preferences. The best way is to test them yourself and see what suits you.&lt;/p&gt;

&lt;p&gt;To help a little bit with that, here I present a collection of my AWS tools. These are applications, plugins, and extensions that I use in my daily work with AWS.&lt;/p&gt;

&lt;h2&gt;
  
  
  CLI
&lt;/h2&gt;

&lt;h3&gt;
  
  
  AWS CLI
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;&lt;a href="https://aws.amazon.com/cli/" rel="noopener noreferrer"&gt;AWS CLI&lt;/a&gt;&lt;/strong&gt; is the obvious first position on this list. After all, sometimes it’s just quicker to do something in the CLI. Other times we need to wrap some process interacting with AWS in a simple script.&lt;/p&gt;

&lt;p&gt;The AWS CLI v2 has some nice features, such as improved command completion. I’m using the &lt;a href="https://fishshell.com/" rel="noopener noreferrer"&gt;fish shell&lt;/a&gt; in the terminal, and the AWS CLI &lt;a href="https://github.com/aws/aws-cli/issues/1079" rel="noopener noreferrer"&gt;does not natively provide&lt;/a&gt; command completion for it. Fortunately, fish is extremely good with completions, so the fix is quite easy. It’s enough to add one (quite long) line to the config file, and it works like a charm.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;~/.config/fish/config.fish&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;test -x (which aws_completer); and complete --command aws --no-files --arguments '(begin; set --local --export COMP_SHELL fish; set --local --export COMP_LINE (commandline); aws_completer | sed \'s/ $//\'; end)'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Ffish-aws-cli-v2-completion.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Ffish-aws-cli-v2-completion.png" alt="AWS CLI v2 completion in fish"&gt;&lt;/a&gt;&lt;/p&gt;
AWS CLI v2 completion in fish



&lt;h3&gt;
  
  
  asp plugin for oh-my-fish
&lt;/h3&gt;

&lt;p&gt;As mentioned above, I’m using the fish shell. Its true beauty and power can be unlocked with &lt;a href="https://github.com/oh-my-fish/oh-my-fish" rel="noopener noreferrer"&gt;Oh My Fish&lt;/a&gt;, which is basically a plugin and theme manager for the shell.&lt;/p&gt;

&lt;p&gt;The OMF plugin I use daily when working with AWS is &lt;strong&gt;&lt;a href="https://github.com/m-radzikowski/omf-plugin-asp" rel="noopener noreferrer"&gt;asp&lt;/a&gt;&lt;/strong&gt;. It’s a small, handy plugin that allows changing the currently selected AWS profile. I took it over from the original author and I’m its maintainer right now.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Ffish-asp-plugin-300x273.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Ffish-asp-plugin-300x273.png" alt="Oh My Fish asp plugin"&gt;&lt;/a&gt;&lt;/p&gt;
Oh My Fish asp plugin



&lt;p&gt;If you are using zsh instead of fish, a &lt;a href="https://github.com/ohmyzsh/ohmyzsh/blob/master/plugins/aws/README.md" rel="noopener noreferrer"&gt;similar plugin&lt;/a&gt; exists also for &lt;a href="https://github.com/ohmyzsh/ohmyzsh" rel="noopener noreferrer"&gt;Oh My Zsh&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Infrastructure as Code
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Serverless Framework
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;&lt;a href="https://www.serverless.com/" rel="noopener noreferrer"&gt;Serverless Framework&lt;/a&gt;&lt;/strong&gt; is the most basic tool for my work with AWS. The built-in functionalities and number of community plugins accelerate infrastructure development. Even when creating just “ordinary” stacks, without any Lambda functions or other plugin-driven resources, writing CloudFormation with syntax extended by Serverless (for example, with variables) is far easier.&lt;/p&gt;

&lt;p&gt;While CloudFormation is not always the best, it’s the default IaC for AWS and supported by them. The Serverless Framework is, in fact, building and deploying normal CloudFormation templates. That gives me confidence that I’m depending mostly on AWS, without additional parties. Anything that is not directly supported by Serverless or its plugins can be created using raw CloudFormation in the stack. This makes the IaC, the critical element of systems, stable and powerful.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Fserverless-stack-with-cf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Fserverless-stack-with-cf.png" alt="Sample Serverless stack with raw CloudFormation resource"&gt;&lt;/a&gt;&lt;/p&gt;
Sample Serverless stack with raw CloudFormation resource



&lt;h2&gt;
  
  
  Chrome extensions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  AWS Extend Switch Roles
&lt;/h3&gt;

&lt;p&gt;If you work on multiple AWS accounts and/or use various roles, you must know the pain of switching between them in the AWS Console. The site remembers your past roles, so you don’t have to provide the role name and account ID every time – at least as long as you have no more than 5 of them. That’s the limit of the role history, after which entries are overwritten.&lt;/p&gt;

&lt;p&gt;Here to help comes the &lt;strong&gt;&lt;a href="https://chrome.google.com/webstore/detail/aws-extend-switch-roles/jpmkfafbacpgapdghgdpembnojdlgkdl" rel="noopener noreferrer"&gt;AWS Extend Switch Roles&lt;/a&gt;&lt;/strong&gt; extension. The configuration is dead simple – you just copy the content of your &lt;code&gt;~/.aws/config&lt;/code&gt; file. From that point, when you click on the extension icon, you will get a nice, filterable list of all defined roles to choose from. And you can have as many of them as you need.&lt;/p&gt;
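
&lt;p&gt;In other words, entries like this one in your config (the account ID and role name are examples) become items on that list:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[profile company-prod-admin]
role_arn = arn:aws:iam::123456789012:role/AdminAccess
source_profile = default
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;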

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Faws-extend-switch-roles-plugin-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Faws-extend-switch-roles-plugin-1.png" alt="AWS Extend Switch Role extension"&gt;&lt;/a&gt;&lt;/p&gt;
AWS Extend Switch Role extension



&lt;p&gt;Available also for &lt;a href="https://addons.mozilla.org/en-US/firefox/addon/aws-extend-switch-roles3/" rel="noopener noreferrer"&gt;Firefox&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS Simple Iconification Service
&lt;/h3&gt;

&lt;p&gt;This one is from the “small but delightful” category. The &lt;strong&gt;&lt;a href="https://chrome.google.com/webstore/detail/aws-simple-iconification/edagjlhogddnlkbkllibfhbekpcdppbk" rel="noopener noreferrer"&gt;AWS Simple Iconification Service&lt;/a&gt;&lt;/strong&gt; extension fixes favicons in the AWS Console.&lt;/p&gt;

&lt;p&gt;The fact that half of the service pages in the AWS Console have one of two versions of the same default favicon is somewhat astonishing. The fact that the other half have favicons in a few different styles, from 3D to flat, is just amusing. Well, we all know that the UI is not the AWS team’s priority, and the whole site looks a little bit like Frankenstein’s monster.&lt;/p&gt;

&lt;p&gt;But identical or inconsistent favicons don’t only hurt someone’s sensitive UI feelings. They also make it harder to quickly find one of the 15 currently open AWS Console tabs during development. Or, worse, while looking for the cause of a production error on some pleasant Friday afternoon.&lt;/p&gt;

&lt;p&gt;With the Iconification extension, all services get their own favicons, taken from the official AWS architecture icons.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Faws-simple-iconification-service-comparison.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Faws-simple-iconification-service-comparison.png" alt="AWS Simple Iconification Service – favicon comparison"&gt;&lt;/a&gt;&lt;/p&gt;
AWS Simple Iconification Service – favicon comparison



&lt;p&gt;Available also for &lt;a href="https://addons.mozilla.org/pl/firefox/addon/simple-iconification-service/" rel="noopener noreferrer"&gt;Firefox&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  IDE plugins
&lt;/h2&gt;

&lt;h3&gt;
  
  
  AWS Toolkit for JetBrains
&lt;/h3&gt;

&lt;p&gt;We can argue about which IDE is the best, but for me, it’s always one from the JetBrains stable. Thus this list could not miss the &lt;strong&gt;&lt;a href="https://aws.amazon.com/intellij/" rel="noopener noreferrer"&gt;AWS Toolkit for JetBrains&lt;/a&gt;&lt;/strong&gt; plugin.&lt;/p&gt;

&lt;p&gt;The list of services the plugin supports is slowly growing. As I’m not building SAM applications, the most useful parts for me so far are the S3, CloudWatch, and CloudFormation interfaces. Being able to work with them directly from the IDE, sometimes easier and faster than going through the AWS Console in the browser, is really handy.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Faws-toolkit-for-jetbrains.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Faws-toolkit-for-jetbrains.png" alt="AWS Toolkit for JetBrains menu"&gt;&lt;/a&gt;&lt;/p&gt;
AWS Toolkit for JetBrains menu



&lt;p&gt;The plugin works with all JetBrains IDEs (IntelliJ, WebStorm, PyCharm, Rider, etc.).&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS Toolkit for VS Code
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;&lt;a href="https://aws.amazon.com/visualstudiocode/" rel="noopener noreferrer"&gt;AWS Toolkit for Visual Studio Code&lt;/a&gt;&lt;/strong&gt; is a little bit younger brother of the Toolkit for JetBrains. Their development goes with similar, but not identical paths. Some features are available sooner in one of them.&lt;/p&gt;

&lt;p&gt;I’m not using VS Code on a daily basis, but its AWS Toolkit is one of the reasons I launch it. It provides an Amazon States Language graph preview, which is a great help when working a lot with Step Functions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Fvscode-step-functions-preview.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Fvscode-step-functions-preview.png" alt="Step Function graph preview in VS Code"&gt;&lt;/a&gt;&lt;/p&gt;
Step Function graph preview in VS Code



&lt;p&gt;This will stay on the list for now, at least until the &lt;a href="https://github.com/aws/aws-toolkit-jetbrains/issues/584" rel="noopener noreferrer"&gt;same feature&lt;/a&gt; becomes available in the Toolkit for JetBrains.&lt;/p&gt;

&lt;h3&gt;
  
  
  Serverless Framework plugin
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;&lt;a href="https://plugins.jetbrains.com/plugin/14537-serverless-framework-completion-navigation-syntax" rel="noopener noreferrer"&gt;Serverless Framework Completion/Navigation/Syntax&lt;/a&gt;&lt;/strong&gt; plugin for IntelliJ provides support for writing Serverless stacks. While rather basic, it can help a lot. First, it warns of references to non-existing files or resources. Furthermore, the ability to click on the resource name or path and jump straight to the code is very useful and minimizes the scrolling and clicking through files.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture diagrams
&lt;/h2&gt;

&lt;p&gt;A picture tells more than a thousand words. And a good software architecture diagram can tell more than any other kind of documentation, especially when working in a microservice or serverless environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  OmniGraffle
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;&lt;a href="https://www.omnigroup.com/omnigraffle" rel="noopener noreferrer"&gt;OmniGraffle&lt;/a&gt;&lt;/strong&gt; is a paid and Mac-only application for prototyping, design, and diagramming. My case is the latter and the application does a good job in that area. After remembering only a few shortcuts the work is intuitive and fast. Even if you are pedantic like me and everything on the diagram must be exactly aligned, with OmniGraffle it’s quick to do.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Fomnigraffle-example.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Fomnigraffle-example.png" alt="OmniGraffle AWS architecture example"&gt;&lt;/a&gt;&lt;/p&gt;
OmniGraffle AWS architecture example



&lt;p&gt;A nice feature is &lt;a href="https://stenciltown.omnigroup.com/" rel="noopener noreferrer"&gt;Stenciltown&lt;/a&gt; – a community-driven library of “stencils”, packs of graphics that you can add and use in OmniGraffle. Apart from that, there are also paid stencils available on the internet.&lt;/p&gt;

&lt;p&gt;If you use OmniGraffle and need AWS icons, &lt;a href="https://stenciltown.omnigroup.com/stencils/aws-architecture-icons-light-all-2020-04/" rel="noopener noreferrer"&gt;here&lt;/a&gt; is a stencil from me.&lt;/p&gt;

&lt;p&gt;And if you want to create a stencil on your own, here is my tool that will do it for you: &lt;a href="https://github.com/m-radzikowski/omnigraffle-stencil" rel="noopener noreferrer"&gt;OmniGraffle Stencil generator&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  diagrams.net / draw.io
&lt;/h3&gt;

&lt;p&gt;The OmniGraffle app is great to use but has two drawbacks: it’s macOS-only and paid. Sometimes you cannot expect everyone to use it.&lt;/p&gt;

&lt;p&gt;For such cases, I use &lt;strong&gt;&lt;a href="https://www.diagrams.net/" rel="noopener noreferrer"&gt;diagrams.net&lt;/a&gt;&lt;/strong&gt; (&lt;a href="https://www.diagrams.net/blog/move-diagrams-net" rel="noopener noreferrer"&gt;previously known as draw.io&lt;/a&gt;). It’s free and works in the browser, so everyone can edit the diagrams. And for Confluence, it’s really worth buying the add-on that integrates it – having editable diagrams in the same place as the rest of the documentation is the best thing possible.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Fdiagrams-net-example.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fbetterdev.blog%2Fapp%2Fuploads%2F2020%2F10%2Fdiagrams-net-example.png" alt="diagrams.net AWS architecture example"&gt;&lt;/a&gt;&lt;/p&gt;
diagrams.net AWS architecture example



&lt;p&gt;Sadly, in comparison with OmniGraffle, while diagrams.net wins in the accessibility category, its usability and user experience are, in my opinion, worse. Not bad, just worse.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;It’s not an especially long list. There are a lot more tools, toolkits, extensions, and plugins on the internet. From quite a few that I reviewed and tested, only the ones above survived the test of time. Maybe a list of “tools for AWS that I do not use” will appear someday?&lt;/p&gt;

&lt;p&gt;Of course, apart from AWS-related tools, there are a lot of other ones that I use. But that may be a topic for another post.&lt;/p&gt;

&lt;p&gt;Maybe you have some tools not listed here that you find extremely useful when working with AWS? Or at least ones that solve some minor inconveniences – that’s important as well. If so, let me know in the comments, and I will be happy to check them out!&lt;/p&gt;

&lt;p&gt;Toolbox icon in the featured image made by &lt;a href="https://smashicons.com/" rel="noopener noreferrer"&gt;Smashicons&lt;/a&gt; from &lt;a href="https://www.flaticon.com/" rel="noopener noreferrer"&gt;www.flaticon.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>aws</category>
      <category>serverless</category>
      <category>tools</category>
    </item>
    <item>
      <title>⚡ Speed up everyday work with handy Git aliases</title>
      <dc:creator>Maciej Radzikowski</dc:creator>
      <pubDate>Thu, 17 Sep 2020 10:01:09 +0000</pubDate>
      <link>https://forem.com/mradzikowski/speed-up-everyday-work-with-handy-git-aliases-5ddl</link>
      <guid>https://forem.com/mradzikowski/speed-up-everyday-work-with-handy-git-aliases-5ddl</guid>
      <description>&lt;p&gt;&lt;em&gt;This post was originally published at &lt;a href="https://betterdev.blog/handy-git-aliases/"&gt;https://betterdev.blog/handy-git-aliases/&lt;/a&gt; - check it out for more related content.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Git allows us to define &lt;a href="https://git-scm.com/book/en/v2/Git-Basics-Git-Aliases"&gt;aliases&lt;/a&gt;, which are essentially our own custom commands. They may be just calls to other commands with parameters, or even entire shell scripts. The possibilities are unlimited.&lt;/p&gt;

&lt;p&gt;Do you google for that one &lt;a href="https://git-scm.com/"&gt;Git&lt;/a&gt; command you forget every time? Do you often execute several commands one by one, always in the same combination, to achieve a single effect? Or have you seen a really nice Git command on the internet, but with way too many flags to use it in real life? Git aliases are the solution.&lt;/p&gt;

&lt;p&gt;Here I will show the Git aliases I use in everyday work, with explanations.&lt;/p&gt;

&lt;h1&gt;
  
  
  Defining first Git alias
&lt;/h1&gt;

&lt;p&gt;Aliases are part of the &lt;a href="https://git-scm.com/docs/git-config"&gt;Git configuration&lt;/a&gt; that is saved to the &lt;code&gt;~/.gitconfig&lt;/code&gt; file. They can be added or modified directly by editing this file, or by executing a command that will do this for us.&lt;/p&gt;

&lt;p&gt;Let’s create an alias for the &lt;code&gt;git status&lt;/code&gt; command that will be just &lt;code&gt;git s&lt;/code&gt; – a simple abbreviation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;git config &lt;span class="nt"&gt;--global&lt;/span&gt; alias.s status
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Alternatively, we can open the &lt;code&gt;~/.gitconfig&lt;/code&gt; file (creating it if it does not already exist) and add these config lines by hand:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[alias]&lt;/span&gt;
    &lt;span class="py"&gt;s&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;status&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;In both cases, we add a new alias named &lt;code&gt;s&lt;/code&gt; for the &lt;code&gt;status&lt;/code&gt; command. From then on, these two calls are equivalent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;git status
git s
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;More aliases may be added to the same &lt;code&gt;[alias]&lt;/code&gt; block in the &lt;code&gt;~/.gitconfig&lt;/code&gt; file – we don’t need to repeat it. We just add the next lines under it.&lt;/p&gt;

&lt;h1&gt;
  
  
  Useful Git aliases for everybody
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Handy shortcuts for Git commands
&lt;/h2&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[alias]&lt;/span&gt;
    &lt;span class="py"&gt;s&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;status&lt;/span&gt;
    &lt;span class="py"&gt;c&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;commit&lt;/span&gt;
    &lt;span class="py"&gt;go&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;checkout&lt;/span&gt;
    &lt;span class="py"&gt;gob&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;checkout -b&lt;/span&gt;
    &lt;span class="py"&gt;d&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;diff&lt;/span&gt;
    &lt;span class="py"&gt;dc&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;diff --cached&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;It may look silly at first. Oh, we can save 5 letters calling &lt;code&gt;git status&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;But when you start to use those short command versions, you will find it frustrating to go back. These versions are two times shorter (including the &lt;code&gt;git&lt;/code&gt; call at the beginning) than the normal ones, meaning you type them two times faster. And for the &lt;code&gt;git status&lt;/code&gt; and &lt;code&gt;git commit&lt;/code&gt; commands, which we usually execute multiple times a day, that’s a significant change. You stop spending time typing commands and analyze the results instead. It’s also one second less of distraction from work.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Bi3ksxwW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/04z8dh15bb989g0xjgys.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Bi3ksxwW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/04z8dh15bb989g0xjgys.png" alt="carbon (3)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The alias I use to move between branches, &lt;code&gt;git go&lt;/code&gt;, has a newer alternative available since Git 2.23: &lt;code&gt;git switch&lt;/code&gt;, a command dedicated to changing branches. But still, I prefer my shorter version.&lt;/p&gt;
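&lt;p&gt;If you prefer the newer command, nothing stops you from aliasing it the same way – one possible variant:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight ini"&gt;&lt;code&gt;[alias]
    # switch to an existing branch
    sw  = switch
    # create a new branch and switch to it
    swc = switch -c
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;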

&lt;h2&gt;
  
  
  Beautiful and meaningful Git history log
&lt;/h2&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[pretty]&lt;/span&gt;
    &lt;span class="py"&gt;better-oneline&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"format:%C(auto)%h%d %s %Cblue[%cn]"&lt;/span&gt;

&lt;span class="nn"&gt;[alias]&lt;/span&gt;
    &lt;span class="py"&gt;tree&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;log --pretty=better-oneline --all --graph&lt;/span&gt;
    &lt;span class="py"&gt;ls&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;log --pretty=better-oneline&lt;/span&gt;
    &lt;span class="py"&gt;ll&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;log --pretty=better-oneline --numstat&lt;/span&gt;

    &lt;span class="py"&gt;details&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"!f() { git ll "$1"^.."$1"; }; f"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;For these, we first need to declare a new &lt;code&gt;git log&lt;/code&gt; format, so we can use it in all three aliases.&lt;/p&gt;

&lt;p&gt;The first 3 aliases (lines #5-7) show the Git history in a nicer form. &lt;code&gt;git tree&lt;/code&gt; is my favorite and the most used, as it shows a branching graph similar to the one on GitHub or GitLab. Irreplaceable in a &lt;a href="https://nvie.com/posts/a-successful-git-branching-model/"&gt;branch-based workflow&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You can think of &lt;code&gt;git ls&lt;/code&gt; and &lt;code&gt;git ll&lt;/code&gt; as equivalents of the shell &lt;code&gt;ls&lt;/code&gt; command for listing files. The first version shows the commit history in a nice, one-line-per-commit format. The second additionally shows the modified files, with the number of added and removed lines in each of them.&lt;/p&gt;

&lt;p&gt;The custom format we define makes the output of all 3 commands nice and meaningful. We get the short commit hash, the message, the author, and the branch name pointing to that commit, if any.&lt;/p&gt;

&lt;p&gt;The last one, &lt;code&gt;git details&lt;/code&gt; (line #9), shows the same statistics as &lt;code&gt;git ll&lt;/code&gt;, but for a single commit. We use it by providing a commit reference as an argument – for example, &lt;code&gt;git details HEAD&lt;/code&gt; to show the files modified in the last commit.&lt;/p&gt;

&lt;p&gt;This last alias has a different syntax than all the previous ones. It starts with an exclamation mark (&lt;code&gt;!&lt;/code&gt;), which makes Git execute it as a shell command rather than a &lt;code&gt;git&lt;/code&gt; subcommand. Such commands are always executed in the repository root directory. Additionally, we need to wrap it in a function to safely use positional arguments – see the explanation &lt;a href="https://blog.theodo.fr/2017/06/git-game-advanced-git-aliases/"&gt;here&lt;/a&gt;.&lt;/p&gt;
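&lt;p&gt;For example (the second commit hash here is made up):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;git details HEAD     # files changed in the last commit
git details abc1234  # files changed in the given commit
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;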

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--l_Z9jXpk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ohdbm69j4nod74j4q0id.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--l_Z9jXpk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ohdbm69j4nod74j4q0id.png" alt="carbon (4)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Plurals for listing all elements
&lt;/h2&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[alias]&lt;/span&gt;
    &lt;span class="py"&gt;branches&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;branch -a&lt;/span&gt;
    &lt;span class="py"&gt;tags&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;tag&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;These two aliases are very helpful, especially for beginners. They make listing all branches and tags as easy as using the plural form of the word. They also fix an inconsistency, as the native commands list all branches and all tags in different ways.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fast files adding and committing
&lt;/h2&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[alias]&lt;/span&gt;
    &lt;span class="py"&gt;a&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;!cd ${GIT_PREFIX:-.} &amp;amp;&amp;amp; git add . &amp;amp;&amp;amp; git s&lt;/span&gt;
    &lt;span class="py"&gt;aa&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;!git add -A &amp;amp;&amp;amp; git s&lt;/span&gt;

    &lt;span class="py"&gt;ac&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;!cd ${GIT_PREFIX:-.} &amp;amp;&amp;amp; git add . &amp;amp;&amp;amp; git c&lt;/span&gt;
    &lt;span class="py"&gt;aac&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;!git add -A &amp;amp;&amp;amp; git c&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;These commands will speed up committing but need to be used with care.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;git a&lt;/code&gt; will add files from the current directory and its subdirectories to the &lt;a href="https://softwareengineering.stackexchange.com/a/119790"&gt;index/staging area&lt;/a&gt;, ready to be committed. It will also show &lt;code&gt;git status&lt;/code&gt; right away, so you see the state of your repository after the call. Remember to check that no random files were added by mistake.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;git aa&lt;/code&gt; will do a very similar thing, but it will add all modified files from the whole repository, no matter which directory you call it from. You can memorize these two commands as “add” and “add all”.&lt;/p&gt;

&lt;p&gt;The other two aliases are extended versions of the previous ones – they additionally commit the changes. Using them is very helpful and time-saving, but we need to be sure that only what we want gets committed.&lt;/p&gt;
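&lt;p&gt;Since Git appends any extra arguments to the end of a shell-style alias, the commit message can be passed directly (the messages here are just examples):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;git ac -m "Fix typo in README"        # stage the current directory, then commit
git aac -m "Update all dependencies"  # stage everything, then commit
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;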

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--OWOTaGoS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ebo3axqzl0vu5b0gjimy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--OWOTaGoS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ebo3axqzl0vu5b0gjimy.png" alt="carbon (5)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Clearing Git workspace and index
&lt;/h2&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[alias]&lt;/span&gt;
    &lt;span class="py"&gt;unstage&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;reset HEAD&lt;/span&gt;
    &lt;span class="py"&gt;cleanout&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;!git clean -df &amp;amp;&amp;amp; git checkout -- .&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The first alias, &lt;code&gt;unstage&lt;/code&gt;, does the opposite of the &lt;code&gt;git aa&lt;/code&gt; alias we defined previously. It’s a shortcut for removing all files from the index/staging area. It doesn’t undo the changes made in the files – they just go back to the workspace.&lt;/p&gt;

&lt;p&gt;The next alias behaves differently. It acts on the files in the workspace (not yet added to the staging area with &lt;code&gt;git add&lt;/code&gt;). &lt;code&gt;git cleanout&lt;/code&gt; will undo all changes in the workspace files and remove all new, never-committed files. Effectively, we get back the clean repository state of the last commit.&lt;/p&gt;
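&lt;p&gt;A typical cleanup could look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;git unstage   # move everything back from the index to the workspace
git cleanout  # then drop all uncommitted changes for a clean state
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;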

&lt;p&gt;&lt;strong&gt;Careful&lt;/strong&gt; – &lt;code&gt;cleanout&lt;/code&gt; will cause you to lose all uncommitted changes!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2pW8Vklk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/56kx7b92m6wpzht4hj9g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2pW8Vklk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/56kx7b92m6wpzht4hj9g.png" alt="carbon (6)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Undoing commits and merges
&lt;/h2&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[alias]&lt;/span&gt;
    &lt;span class="py"&gt;uncommit&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;reset --soft HEAD~1&lt;/span&gt;
    &lt;span class="py"&gt;unmerge&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;reset --hard ORIG_HEAD&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;How many times have you realized that you committed or merged branches too soon? Here are commands to easily revert that. They do exactly what their names suggest.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;git uncommit&lt;/code&gt; will remove the last commit. All committed changes go back to the index – nothing is lost. You can change them and commit again. It’s helpful instead of &lt;code&gt;git commit --amend&lt;/code&gt; when we need to make bigger modifications or wait longer before committing.&lt;/p&gt;

&lt;p&gt;The other alias does a similar thing for branch merges. When we run &lt;code&gt;git merge&lt;/code&gt; and then &lt;code&gt;git unmerge&lt;/code&gt; right away, we go back to the repository state from before the merge. This time, no files will be left in the index after the command, and if we resolved any merge conflicts, that work will be lost. It’s important to know that we can &lt;code&gt;unmerge&lt;/code&gt; safely only right after the merge, as it uses the special &lt;code&gt;ORIG_HEAD&lt;/code&gt; pointer created during the merge. Other Git commands may change it, so calling &lt;code&gt;unmerge&lt;/code&gt; in any other circumstances could cause us to lose other commits and their content.&lt;/p&gt;

&lt;p&gt;Both of these commands rewrite Git history by removing commits. A good practice is to never do such operations on commits already pushed to the remote repository. In fact, without special permissions on the remote repository, we won’t even be able to push such changes. Therefore, it’s best to use &lt;code&gt;uncommit&lt;/code&gt; and &lt;code&gt;unmerge&lt;/code&gt; only when we spot a mistake right after making it, so we can fix it before it leaves our computer.&lt;/p&gt;
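&lt;p&gt;A typical rescue could look like this (the branch name is illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;git c -m "WIP"       # oops, committed too soon
git uncommit         # the changes are back in the index

git merge feature/x  # oops, merged the wrong branch
git unmerge          # back to the pre-merge state, right away
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;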

&lt;h2&gt;
  
  
  Showing Git merge details
&lt;/h2&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[alias]&lt;/span&gt;
    &lt;span class="py"&gt;merge-span&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"!f() { echo $(git log -1 $2 --merges --pretty=format:%P | cut -d' ' -f1)$1$(git log -1 $2 --merges --pretty=format:%P | cut -d' ' -f2); }; f"&lt;/span&gt;
    &lt;span class="py"&gt;merge-log&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"!git ls `git merge-span .. $1`"&lt;/span&gt;
    &lt;span class="py"&gt;merge-diff&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"!git diff `git merge-span ... $1`"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Git branch merges may cause a lot of headaches, especially for beginners. Here is a way to at least understand and review what a given merge introduced.&lt;/p&gt;

&lt;p&gt;While &lt;code&gt;merge-span&lt;/code&gt; is only a helper function, &lt;code&gt;merge-log&lt;/code&gt; and &lt;code&gt;merge-diff&lt;/code&gt; show us the list of commits added by a merge and all the changes it introduced. We can pass a merge commit hash as a parameter to review a chosen merge; otherwise, the last merge found on the current branch is shown.&lt;/p&gt;
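&lt;p&gt;For example (the commit hash is made up):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;git merge-log           # commits brought in by the last merge on this branch
git merge-diff          # all changes introduced by that merge
git merge-diff abc1234  # the same, for a specific merge commit
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;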

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--fa15Kp-F--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/chig93o47kaylavjms3u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fa15Kp-F--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/chig93o47kaylavjms3u.png" alt="carbon (7)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Summary
&lt;/h1&gt;

&lt;p&gt;Having aliases for the most common actions and using them will speed up and simplify everyday work, allowing us to get the most out of the version control system.&lt;/p&gt;

&lt;p&gt;There is nothing stopping you from adding more aliases. You can create them yourself based on your most common actions. There are also plenty of ready-to-use aliases created by various people. I’ve taken some of my aliases from such lists and got inspired by others. Here they are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/GitAlias/gitalias"&gt;https://github.com/GitAlias/gitalias&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gist.github.com/robmiller/6018582"&gt;https://gist.github.com/robmiller/6018582&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gggritso.com/human-git-aliases"&gt;https://gggritso.com/human-git-aliases&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you have your own favorite aliases, share them in the comments.&lt;/p&gt;

</description>
      <category>git</category>
      <category>alias</category>
      <category>cli</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
