Vadym Kazulkin for AWS Heroes

Quarkus 3 application on AWS Lambda - Part 3: Reducing Lambda cold starts with SnapStart and DynamoDB request priming

Introduction

In part 1 of our series on how to develop, run and optimize a Quarkus web application on AWS Lambda, we demonstrated how to write a sample application using the Quarkus framework, AWS Lambda, Amazon API Gateway and Amazon DynamoDB. We also took the first Lambda performance measurements (cold and warm start times) and observed quite a long cold start time.

In part 2 of the series, we introduced Lambda SnapStart and measured how enabling it reduces the Lambda cold start time by more than 50%. We also clearly observed the impact of the AWS SnapStart snapshot tiered cache in our measurements.

In this part of the series, we'll show how to apply Lambda SnapStart priming techniques, starting with DynamoDB request priming, with the goal of further improving the performance of our Lambda functions.

Sample application with AWS Lambda SnapStart activated and DynamoDB request priming

We'll re-use the sample application introduced in part 1 of our series.

Activating Lambda SnapStart is also a prerequisite for this method.

Globals:
  Function:
    Handler: io.quarkus.amazon.lambda.runtime.QuarkusStreamHandler::handleRequest
    CodeUri: target/function.zip
    Runtime: java21
    SnapStart:
      ApplyOn: PublishedVersions
...

This can be done in the Globals section, in which case SnapStart applies to all Lambda functions defined in the AWS SAM template, or you can add the two lines

SnapStart:
  ApplyOn: PublishedVersions

to the individual Lambda function to activate SnapStart only for that function.
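Attached to a single function resource, this might look as follows (a sketch only: the logical ID GetProductByIdFunction is the function we measure later in this article, and the remaining properties are abbreviated):

```yaml
Resources:
  GetProductByIdFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: io.quarkus.amazon.lambda.runtime.QuarkusStreamHandler::handleRequest
      CodeUri: target/function.zip
      Runtime: java21
      SnapStart:
        ApplyOn: PublishedVersions
```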

You can read more about the concepts behind Lambda SnapStart in part 2.

SnapStart and runtime hooks offer new possibilities to design Lambda functions for low startup latency. With the pre-snapshot hook, we can prepare our Java application as much as possible for the first invocation: we load and initialize as much as possible of what our Lambda function needs before the snapshot is created. This technique is known as priming.
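To make the timeline concrete, here is a minimal, self-contained sketch of the checkpoint/restore lifecycle. It deliberately does not use the real org.crac API; the Resource and GlobalContext types below are simplified stand-ins, included only to show that priming work runs before the snapshot is created and is therefore baked into it:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the checkpoint/restore lifecycle. The real types
// (org.crac.Core, org.crac.Context, org.crac.Resource) are replaced
// by simplified stand-ins for illustration only.
public class PrimingLifecycleSketch {

    // Simplified stand-in for org.crac.Resource.
    interface Resource {
        void beforeCheckpoint() throws Exception; // invoked before the snapshot is taken
        void afterRestore() throws Exception;     // invoked after the snapshot is restored
    }

    // Simplified stand-in for org.crac.Core.getGlobalContext().
    static class GlobalContext {
        private final List<Resource> resources = new ArrayList<>();
        void register(Resource r) { resources.add(r); }
        void beforeCheckpoint() throws Exception { for (Resource r : resources) r.beforeCheckpoint(); }
        void afterRestore() throws Exception { for (Resource r : resources) r.afterRestore(); }
    }

    // Runs through the lifecycle once and records the order of events.
    public static List<String> run() throws Exception {
        List<String> events = new ArrayList<>();
        GlobalContext context = new GlobalContext();

        // Analogue of our priming resource: the expensive first-call work
        // (a DynamoDB request in this article) is recorded as one event here.
        context.register(new Resource() {
            @Override public void beforeCheckpoint() { events.add("prime DynamoDB request"); }
            @Override public void afterRestore() { events.add("after restore"); }
        });

        events.add("deploy");
        context.beforeCheckpoint();        // priming runs now...
        events.add("snapshot created");    // ...so its results are part of the snapshot
        context.afterRestore();            // a cold start restores from the snapshot
        events.add("first invocation");
        return events;
    }

    public static void main(String[] args) throws Exception {
        // prints: deploy -> prime DynamoDB request -> snapshot created -> after restore -> first invocation
        System.out.println(String.join(" -> ", run()));
    }
}
```

The key point the sketch demonstrates: everything the priming resource does happens strictly before "snapshot created", so a restored function starts with that work already done.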

In this section I will introduce the priming of a DynamoDB request, which is implemented in the AmazonDynamoDBPrimingResource class.

import com.fasterxml.jackson.databind.ObjectMapper;
import io.quarkus.runtime.Startup;
import jakarta.annotation.PostConstruct;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import org.crac.Core;
import org.crac.Resource;

@Startup
@ApplicationScoped
public class AmazonDynamoDBPrimingResource implements Resource {

    @Inject
    private ObjectMapper objectMapper;

    @Inject
    private DynamoProductDao productDao;

    @PostConstruct
    public void init () {
        Core.getGlobalContext().register(this);
    }

    @Override
    public void beforeCheckpoint(org.crac.Context<? extends Resource> context) throws Exception {
        productDao.getProduct("0");
    }

    @Override
    public void afterRestore(org.crac.Context<? extends Resource> context) throws Exception {
    }

}

We use Lambda SnapStart CRaC runtime hooks here. To do this, we need to declare the following dependency in pom.xml:

<dependency>
   <groupId>io.github.crac</groupId>
   <artifactId>org-crac</artifactId>
</dependency>

The AmazonDynamoDBPrimingResource class is annotated with the @Startup annotation (so that this bean is initialized directly when the application starts) and implements the org.crac.Resource interface. The class registers itself as a CRaC resource in the init method, which is annotated with @PostConstruct. The priming itself happens in the beforeCheckpoint method, a CRaC runtime hook that is invoked before the microVM snapshot is created: there we search the DynamoDB table for the product with ID 0. We are not even interested in the result of the productDao.getProduct("0") call. What matters is that this call loads and instantiates all required classes and carries out the expensive one-time initialization of the HTTP client (by default the Apache HTTP Client) and the Jackson marshallers (which convert Java objects to JSON and vice versa). Because, with SnapStart activated, all of this happens during the deployment phase of the Lambda function, before the snapshot is created, the snapshot already contains it. After the fast snapshot restore during a Lambda invocation, we therefore gain a lot of cold start performance (see the measurements below). This is what we mean by priming the DynamoDB request.

To ensure that only this priming takes effect, please either comment out or remove the @Startup annotation on the AmazonAPIGatewayPrimingResource class.

Measurements of cold and warm start times of our application with Lambda SnapStart and DynamoDB request priming

In the following, we will measure the performance of our GetProductByIdFunction Lambda function, which we will trigger by invoking curl -H "X-API-Key: a6ZbcDefQW12BN56WEV318" https://{$API_GATEWAY_URL}/prod/products/1.

The results of the experiment are based on reproducing more than 100 cold starts and approximately 100,000 warm starts of the GetProductByIdFunction Lambda function (requesting the already existing product with ID=1) over a duration of about one hour. We give the Lambda function 1024 MB of memory, which is a good trade-off between performance and cost, and use the (default) x86_64 Lambda architecture. For the load tests I used the load testing tool hey, but you can use whichever tool you prefer, such as Serverless-artillery or Postman.
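The memory setting mentioned above can be expressed in template.yaml, for example in the Globals section (a minimal fragment; MemorySize is the standard AWS SAM function property, specified in MB):

```yaml
Globals:
  Function:
    MemorySize: 1024
```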

We will measure with tiered compilation (the default in Java 21, so nothing needs to be set separately) and with the compilation options -XX:+TieredCompilation -XX:TieredStopAtLevel=1. To use the latter, you have to set them in template.yaml in the JAVA_TOOL_OPTIONS environment variable as follows:

Globals:
  Function:
    Handler: io.quarkus.amazon.lambda.runtime.QuarkusStreamHandler::handleRequest
    ...
    Environment:
      Variables:
        JAVA_TOOL_OPTIONS: "-XX:+TieredCompilation -XX:TieredStopAtLevel=1"

Please also note the effect of the AWS SnapStart snapshot tiered cache: with SnapStart activated, the largest cold starts occur during the first measurements, and thanks to the tiered cache the subsequent cold starts have lower values. For more details on the technical implementation of AWS SnapStart and its tiered cache, I refer you to Mike Danilov's presentation "AWS Lambda Under the Hood". I will therefore present the Lambda performance measurements with SnapStart activated both for all approx. 100 cold starts (labelled "all" in the tables) and for the last approx. 70 (labelled "last 70" in the tables), so that the effect of the snapshot tiered cache becomes visible. Depending on how often the respective Lambda function is updated (which invalidates some layers of the cache), a Lambda function can experience thousands or tens of thousands of cold starts during its life cycle, so the first, longer-lasting cold starts no longer carry much weight.

To show the impact of SnapStart with DynamoDB request priming, we'll also present the Lambda performance measurements without SnapStart from part 1, and with SnapStart activated but without any priming techniques applied, as measured in part 2.

Cold (c) and warm (w) start time with tiered compilation in ms:

| Scenario | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No SnapStart enabled | 3344 | 3422 | 3494 | 3633 | 3904 | 3907 | 5.92 | 6.83 | 8.00 | 19.46 | 50.44 | 1233 |
| SnapStart enabled but no priming applied, all | 1643 | 1703 | 1953 | 2007 | 2084 | 2084 | 5.68 | 6.35 | 7.39 | 16.39 | 49.23 | 1386 |
| SnapStart enabled but no priming applied, last 70 | 1604 | 1664 | 1728 | 1798 | 1798 | 1798 | 5.64 | 6.30 | 7.33 | 15.87 | 47.30 | 1286 |
| SnapStart enabled and DynamoDB request priming applied, all | 666 | 720 | 944 | 1117 | 1317 | 1318 | 5.73 | 6.45 | 7.57 | 16.01 | 39.07 | 566 |
| SnapStart enabled and DynamoDB request priming applied, last 70 | 646 | 681 | 774 | 1043 | 1043 | 1043 | 5.64 | 6.40 | 7.51 | 15.75 | 37.54 | 566 |

Cold (c) and warm (w) start time with -XX:+TieredCompilation -XX:TieredStopAtLevel=1 compilation in ms:

| Scenario | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No SnapStart enabled | 3357 | 3456 | 3554 | 4039 | 4060 | 4060 | 6.01 | 6.83 | 8.13 | 19.77 | 53.74 | 1314 |
| SnapStart enabled but no priming applied, all | 1593 | 1625 | 1722 | 1834 | 1930 | 1930 | 5.55 | 6.21 | 7.16 | 16.08 | 50.44 | 1401 |
| SnapStart enabled but no priming applied, last 70 | 1574 | 1621 | 1685 | 1801 | 1801 | 1801 | 5.55 | 6.20 | 7.16 | 15.14 | 49.23 | 1401 |
| SnapStart enabled and DynamoDB request priming applied, all | 636 | 701 | 943 | 973 | 1055 | 1055 | 5.50 | 6.20 | 7.21 | 14.66 | 39.07 | 330 |
| SnapStart enabled and DynamoDB request priming applied, last 70 | 628 | 654 | 692 | 859 | 859 | 859 | 5.50 | 6.15 | 7.04 | 14.08 | 37.25 | 270 |

Conclusion

In this part of the series, we showed how to apply Lambda SnapStart priming techniques, starting with DynamoDB request priming, with the goal of further improving the performance of our Lambda functions. We saw that this kind of priming, at the cost of some additional code, significantly reduces Lambda cold start times compared to simply activating SnapStart. Moreover, we significantly reduced the maximum Lambda warm start times, because priming preloads classes (Java lazily loads classes when they are first required) and performs one-time initialization work (by invoking the method that retrieves a product from the DynamoDB table by its ID) that would otherwise happen during the first warm execution of the Lambda function.

We also saw that the -XX:+TieredCompilation -XX:TieredStopAtLevel=1 compilation options clearly outperformed the default tiered compilation for this type of priming, especially at the higher cold start percentiles.

We also clearly observed the impact of the AWS SnapStart Snapshot tiered cache in our measurements.

In the next part of the series, we'll introduce another Lambda SnapStart priming technique: API Gateway request event priming. We'll then measure the Lambda performance with it applied and compare the results with the approaches already introduced.

