<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Ambar Mehrotra</title>
    <description>The latest articles on Forem by Ambar Mehrotra (@_notanengineer).</description>
    <link>https://forem.com/_notanengineer</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F163226%2Fbe7a4859-f0e2-4ea7-9e64-562539b6b252.jpeg</url>
      <title>Forem: Ambar Mehrotra</title>
      <link>https://forem.com/_notanengineer</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/_notanengineer"/>
    <language>en</language>
    <item>
      <title>Disaster Recovery - A practical guide (Part 1)</title>
      <dc:creator>Ambar Mehrotra</dc:creator>
      <pubDate>Mon, 17 Jan 2022 19:09:57 +0000</pubDate>
      <link>https://forem.com/_notanengineer/disaster-recovery-a-practical-guide-part-1-j6o</link>
      <guid>https://forem.com/_notanengineer/disaster-recovery-a-practical-guide-part-1-j6o</guid>
      <description>&lt;h1&gt;
  
  
  What is Disaster Recovery anyway?
&lt;/h1&gt;

&lt;p&gt;While dealing with terabytes of data every day, it is not uncommon for critical infrastructure components to run into situations that cause data corruption and are not easy to recover from. Many scenarios can force an application into an inconsistent state. These include, but are not limited to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Natural disasters like hurricanes or earthquakes leading to the entire data centre going down&lt;/li&gt;
&lt;li&gt;A bug in the application code leading to incorrect or corrupted data&lt;/li&gt;
&lt;li&gt;Infrastructure failure due to power outages&lt;/li&gt;
&lt;li&gt;Cyber attacks leading to loss of data or partial data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These scenarios are commonly referred to as disasters, and the &lt;strong&gt;ability to recover from these disasters to a consistent state is called disaster recovery&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rKoQ35pK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642442438429/RHcZa2hcu.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rKoQ35pK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642442438429/RHcZa2hcu.gif" alt="chaos-office.gif" width="498" height="280"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this post, I will mostly talk about how we built disaster-recovery strategies for some common data systems like Aurora (MySQL), MongoDB, and Elasticsearch. I will also talk about the challenges we faced, some common pitfalls, and the practical lessons we took away from this project.&lt;/p&gt;

&lt;h1&gt;
  
  
  RTO/RPO
&lt;/h1&gt;

&lt;p&gt;Two terms come up constantly in any discussion of disaster recovery: RTO and RPO. Although they sound fancy, they are intuitive and easy to understand if you look at the problem in a practical way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RPO&lt;/strong&gt; (Recovery Point Objective) - The maximum amount of data loss you are willing to tolerate in case of a disaster, expressed as a span of time. For example, if you can afford to lose one day of data, your RPO is 24 hours, and that is the maximum interval at which you should take backups. Although there are many solutions for taking regular backups, taking them very frequently can drive up costs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RTO&lt;/strong&gt; (Recovery Time Objective) - The maximum amount of time you are willing to spend recovering from a disaster. For example, if it takes me an hour to restore all the lost data, my RTO is 1 hour. Generally speaking, the more data you have, the longer it will take to restore.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ill5ULTb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1640627616097/ApFWLBH2a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ill5ULTb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1640627616097/ApFWLBH2a.png" alt="DisasterRecovery.png" width="621" height="241"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Defining a strategy
&lt;/h1&gt;

&lt;p&gt;Most DR strategies for databases can be divided into 3 major steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Snapshot Creation&lt;/strong&gt;      - Refers to the ability to take snapshots at regular intervals&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Snapshot Retention&lt;/strong&gt;    - Refers to retaining snapshots in a particular window, while deleting everything else. The retention policy windows can generally be divided into &lt;strong&gt;Incremental&lt;/strong&gt; and &lt;strong&gt;Moving&lt;/strong&gt; windows. Examples for each can be found below:

&lt;ul&gt;
&lt;li&gt;Incremental Window
&lt;ul&gt;
&lt;li&gt;Retain one snapshot for every month&lt;/li&gt;
&lt;li&gt;Retain one snapshot for every year&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Moving Window
&lt;ul&gt;
&lt;li&gt;Retain one snapshot for each day for the last 15 days&lt;/li&gt;
&lt;li&gt;Retain one snapshot for each week for the last 4 weeks&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Snapshot Restoration&lt;/strong&gt; - The ability to restore the database to a specific snapshot. Unlike snapshot creation and retention, restoration should not be automated; it should be a manual step. This means that &lt;strong&gt;any kind of data restoration should originate from a clear user intent for such an activity&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's have a look at what we should take into consideration while designing a DR strategy and the corresponding implementations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We should be able to roll out DR one instance at a time&lt;/li&gt;
&lt;li&gt;The rollout should be minimally invasive and should not cause any service disruption unless absolutely necessary&lt;/li&gt;
&lt;li&gt;Taking regular snapshots should be automated, but restoration to a previous point-in-time snapshot should require manual intervention&lt;/li&gt;
&lt;li&gt;The user making the DR plan should be able to specify windows in which backups are taken (the backup process should not cause any disruption in service)&lt;/li&gt;
&lt;li&gt;The user should be able to specify the windows for which snapshots are retained&lt;/li&gt;
&lt;li&gt;The strategy should strike a good balance between RTO and RPO&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With the above-mentioned considerations in mind, we can design a general-purpose strategy that can be implemented across cloud providers in different ways. For example, a DR strategy specification might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"creationStrategy"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"triggerSchedule"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"0 1 * * ? *"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"0 2 * * ? *"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"retentionStrategy"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"triggerSchedule"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"0 2 * * ? *"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"rollingWindow"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"hours"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"days"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"weeks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"incrementalWindow"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"init"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1514764800"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"span"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"month"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"interval"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
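&lt;p&gt;The &lt;code&gt;rollingWindow&lt;/code&gt; portion of the strategy above maps to retention logic along these lines. This is a minimal sketch in Python; the helper name and window defaults are illustrative, not part of our actual implementation:&lt;/p&gt;

```python
from datetime import datetime, timedelta

def retained(snapshots, now, hours=8, days=15, weeks=4):
    """Apply a moving-window retention policy to snapshot timestamps.

    Returns the set of timestamps to keep: every snapshot from the last
    `hours` hours, one per day for the last `days` days, and one per
    week for the last `weeks` weeks. Everything else is deleted.
    """
    keep = set()
    for ts in snapshots:
        age_hours = int((now - ts).total_seconds() // 3600)
        # Keep every snapshot taken within the hourly window.
        if age_hours in range(hours):
            keep.add(ts)
    per_day, per_week = {}, {}
    # Iterating in ascending order means the newest snapshot in each
    # day/week bucket overwrites the older ones.
    for ts in sorted(snapshots):
        age_days = (now - ts).days
        if age_days in range(days):
            per_day[ts.date()] = ts
        if age_days // 7 in range(weeks):
            per_week[ts.isocalendar()[:2]] = ts
    return keep | set(per_day.values()) | set(per_week.values())
```

A retention cron would run this against the list of existing snapshots and delete everything not in the returned set.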



&lt;h1&gt;
  
  
  Architecture
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Cloud agnostic vs cloud specific
&lt;/h2&gt;

&lt;p&gt;Our CD platform is written in Terraform and works off of a state file in an S3 bucket. Each execution of the CD pipeline takes the desired cluster state, compares it with the existing cluster state, and tries to move the cluster from the current to the desired state (like a control loop). One of the core ideas while building our CI/CD systems was that different teams should be able to spawn their own instances of MySQL, MongoDB, etc., without worrying about the cloud provider their application runs on. For example, requesting a MySQL instance on AWS would launch an Aurora instance, while requesting the same MySQL instance on Alicloud would launch an ApsaraDB instance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--KbkD-itr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642364202338/uN-YYIcTZ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--KbkD-itr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642364202338/uN-YYIcTZ.png" alt="c8c5a556-7630-4987-b39e-d8aad3d0d1d7.png" width="880" height="359"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--R6lTcGAC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642364210544/NRRM6cLXP.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--R6lTcGAC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642364210544/NRRM6cLXP.png" alt="169e4fcd-c65d-454d-99e6-4f9bcff9de8e.png" width="721" height="241"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Because of this use-case, even our DR implementation had to be written in a way that could be implemented differently for different cloud components across different cloud providers. The DR setup includes the following steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The CD pipeline creates the crons required to implement the above-mentioned functionality, according to the cloud-specific implementation of the infrastructure component.&lt;/li&gt;
&lt;li&gt;The cronjob or cloud function responsible for creating or removing snapshots internally makes an API call suited to the underlying implementation of the database. For example, the underlying implementation of an SQL database differs across cloud providers -- Aurora on AWS, ApsaraDB on Alicloud, Azure Database for MySQL on Azure, etc.&lt;/li&gt;
&lt;li&gt;The implemented crons trigger at the required intervals and create or delete snapshots&lt;/li&gt;
&lt;/ul&gt;
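&lt;p&gt;On AWS, for example, the snapshot-creation cron for an Aurora cluster boils down to a single RDS API call. A minimal sketch using boto3 follows; the cluster name and the snapshot-naming scheme are hypothetical:&lt;/p&gt;

```python
from datetime import datetime, timezone

def snapshot_id(cluster, now=None):
    """Build a unique, sortable snapshot identifier (hypothetical naming scheme)."""
    now = now or datetime.now(timezone.utc)
    return "{}-{}".format(cluster, now.strftime("%Y-%m-%d-%H-%M"))

def take_aurora_snapshot(cluster):
    """Trigger a manual Aurora cluster snapshot via the RDS API."""
    import boto3  # imported lazily so the naming helper stays testable offline
    rds = boto3.client("rds")
    return rds.create_db_cluster_snapshot(
        DBClusterIdentifier=cluster,
        DBClusterSnapshotIdentifier=snapshot_id(cluster),
    )
```

The equivalent call for ApsaraDB or Azure Database for MySQL would use that provider's SDK, with the rest of the cron unchanged.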

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--e7Zo-hMA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642361736883/rRdVbMnQW.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--e7Zo-hMA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642361736883/rRdVbMnQW.png" alt="Disaster Recovery Demo.png" width="880" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Snapshot Creation and Retention
&lt;/h2&gt;

&lt;p&gt;For both snapshot creation and retention, we need the ability to run an automated job at regular intervals that can take new snapshots or remove existing ones. The frequency at which this job runs is defined in our DR strategy, and its value can be decided based on our RTO and RPO objectives.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--CNy--E-I--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642352479561/FHFdnkL1O.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--CNy--E-I--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642352479561/FHFdnkL1O.png" alt="DR.png" width="709" height="309"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Snapshot Restoration
&lt;/h2&gt;

&lt;p&gt;As mentioned before, restoring a database from a snapshot should require explicit user intent, and the corresponding API and UI should be guarded by proper access control. The restoration flow can look something like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User marks a snapshot as an active candidate for restoration via the API or UI&lt;/li&gt;
&lt;li&gt;The CI system should validate the user's request and access level, and store the user intent in a database&lt;/li&gt;
&lt;li&gt;The user intent should be passed to the CD pipeline in the next run. This can work in a pull or a push model.&lt;/li&gt;
&lt;li&gt;The CD pipeline should infer that the desired state requires a database to be restored from a snapshot and take action accordingly to restore the data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--tFLS6bNa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642430524359/4UbxdCcLW.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tFLS6bNa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642430524359/4UbxdCcLW.png" alt="restore_snapshot_new.png" width="596" height="362"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the interest of not making this post too long, I will describe the exact implementations we chose for MySQL, MongoDB, and Elasticsearch in the next part of this series, along with the lessons we learned while implementing DR for each of these database types. Stay tuned and cheers :)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qckHygGq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642442792484/nwdrZVrta.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qckHygGq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642442792484/nwdrZVrta.gif" alt="cheers-jack-sparrow.gif" width="436" height="199"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>cloud</category>
      <category>programming</category>
      <category>database</category>
    </item>
    <item>
      <title>Elasticsearch Backup and Restore with AWS S3 in Kubernetes</title>
      <dc:creator>Ambar Mehrotra</dc:creator>
      <pubDate>Sat, 13 Jun 2020 18:36:25 +0000</pubDate>
      <link>https://forem.com/_notanengineer/elasticsearch-backup-and-restore-with-aws-s3-in-kubernetes-mof</link>
      <guid>https://forem.com/_notanengineer/elasticsearch-backup-and-restore-with-aws-s3-in-kubernetes-mof</guid>
<description>&lt;p&gt;In my day job, I get to work with things like Docker, Kubernetes, Terraform, and various cloud components across cloud providers. We have multiple Elasticsearch clusters running inside our Kubernetes cluster (EKS), installed as charts via &lt;a href="https://helm.sh/"&gt;Helm&lt;/a&gt;, the well-known package manager for Kubernetes. Recently, I had to set up a disaster-recovery strategy to restore these clusters to a previous stable state in case of a failure.&lt;/p&gt;

&lt;p&gt;The process involved taking regular snapshots of the Elasticsearch cluster and backing them up in an &lt;strong&gt;S3 bucket&lt;/strong&gt;. These backups can later be used to restore the cluster state at a given point in time in case of a disaster. Although the process was not that complicated and was more or less documented, I still had to google some configuration options to get it to work properly, so I thought I would lay out the exact steps in a small blog post.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;NOTE: If you are using Elasticsearch version 7.5 and above, Elasticsearch has a pretty great module called &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/7.x/getting-started-snapshot-lifecycle-management.html"&gt;Snapshot Lifecycle Management&lt;/a&gt; and I suggest you check that out.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The main idea behind the setup goes like the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Configure the S3 repository plugin for the Elasticsearch cluster&lt;/li&gt;
&lt;li&gt;Call the ES snapshot API at regular intervals to take incremental snapshots&lt;/li&gt;
&lt;li&gt;Use the restore API to restore the indexes or cluster state from these backups&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The setup for achieving the above-mentioned goals can be divided into 3 main parts:&lt;/p&gt;

&lt;h2&gt;
  
  
  Enable the S3 repository plugin
&lt;/h2&gt;

&lt;p&gt;Enabling plugins in Elasticsearch requires a restart of the ES cluster. Therefore, the official documentation suggests building a custom Docker image with the S3 plugin installed in the image itself. According to the docs:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;There are a couple of reasons we recommend this.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tying the availability of Elasticsearch to the download service to install plugins is not a great idea or something that we recommend. Especially in Kubernetes where it is normal and expected for a container to be moved to another host at random times.&lt;/li&gt;
&lt;li&gt;Mutating the state of a running Docker image (by installing plugins) goes against best practices of containers and immutable infrastructure.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;So, to build a Docker image with the S3 repository plugin enabled, you can use the following Dockerfile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ARG elasticsearch_version
FROM docker.elastic.co/elasticsearch/elasticsearch:${elasticsearch_version}

RUN bin/elasticsearch-plugin install --batch repository-s3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enabling plugins in ES requires extra permissions; the &lt;code&gt;--batch&lt;/code&gt; flag tells ES to grant any permissions required for the plugin installation without prompting for confirmation.&lt;/p&gt;
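&lt;p&gt;The image can then be built by passing the ES version as a build argument (the image tag below is a placeholder):&lt;/p&gt;

```shell
docker build --build-arg elasticsearch_version=6.3.1 -t my-registry/elasticsearch-s3:6.3.1 .
```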

&lt;h2&gt;
  
  
  Configure Elasticsearch to use S3 bucket for storing snapshots
&lt;/h2&gt;

&lt;p&gt;There are many parameters you can adjust when registering an S3 bucket for storing Elasticsearch snapshots; for the complete set of options, take a look at the &lt;a href="https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository-s3-repository.html"&gt;official documentation&lt;/a&gt;. For a basic setup, you can register the S3 bucket by making a curl call to the snapshot repository endpoint of ES:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;PUT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;_snapshot/my_s&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="err"&gt;_repository&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"s3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"settings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"bucket"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"my_bucket_name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"another_setting"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"setting_value"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configure permissions that allow Elasticsearch pod to access the S3 bucket
&lt;/h2&gt;

&lt;p&gt;Thanks to amazing projects like &lt;a href="https://github.com/jtblin/kube2iam"&gt;kube2iam&lt;/a&gt; that help you easily provide required IAM access to individual Kubernetes objects, this job has become quite easy. The helm chart for Elasticsearch has the provision of taking &lt;code&gt;podAnnotations&lt;/code&gt; as an input. These annotations are applied to the Elasticsearch pods and can leverage the full functionality of kube2iam for accessing the S3 bucket.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;podAnnotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
  &lt;span class="na"&gt;iam.amazonaws.com/role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-iam-role"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;The corresponding IAM role can be easily generated using AWS clients like boto3 or AWS plugins in Terraform, or any other AWS client at your disposal.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Informing the Elasticsearch Helm chart about ES version
&lt;/h2&gt;

&lt;p&gt;This setting was not mentioned in the plugin documentation in a straightforward manner, and I had to search around a bit to figure it out. You need to set the &lt;code&gt;esMajorVersion&lt;/code&gt; flag as well if you are using a custom image and not running the chart's default Elasticsearch version. For example, I had to set &lt;code&gt;esMajorVersion: 6&lt;/code&gt; as I was running version 6.3.1 of Elasticsearch.&lt;br&gt;
You can have a look at the Elasticsearch &lt;a href="https://github.com/elastic/helm-charts/blob/master/elasticsearch/templates/statefulset.yaml#L250"&gt;statefulset&lt;/a&gt; for checking the exact usage of this flag.&lt;/p&gt;
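&lt;p&gt;For reference, the relevant &lt;code&gt;values.yaml&lt;/code&gt; fragment for the chart might look like this (the custom image name is hypothetical):&lt;/p&gt;

```yaml
image: "my-registry/elasticsearch-s3"
imageTag: "6.3.1"
esMajorVersion: 6
```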

&lt;p&gt;That's it, now we are ready to take Elasticsearch snapshots or restore from them.&lt;/p&gt;
&lt;h2&gt;
  
  
  Taking Snapshots
&lt;/h2&gt;

&lt;p&gt;This part is pretty straightforward. Elasticsearch provides a snapshot API which can be triggered to take backups of the entire cluster state or specific indexes.&lt;/p&gt;

&lt;p&gt;For snapshots of the entire cluster, you can use the following curl call&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;PUT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/_snapshot/my_backup/snapshot_&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="err"&gt;?wait_for_completion=&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also specify exact indexes that you want to take backup of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;PUT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/_snapshot/my_backup/snapshot_&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="err"&gt;?wait_for_completion=&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"indices"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"index_1,index_2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ignore_unavailable"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"include_global_state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"metadata"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"taken_by"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"kimchy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"taken_because"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"backup before upgrading"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once a snapshot is created, information about it can be obtained using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;GET&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/_snapshot/my_backup/snapshot_&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also, to automate the process of taking regular backups, you can use Kubernetes CronJobs to periodically make these API calls to the Elasticsearch snapshot endpoint.&lt;/p&gt;
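&lt;p&gt;A minimal CronJob that triggers a nightly snapshot could look like the following sketch. The service name assumes the Helm chart's default, and the snapshot name uses Elasticsearch's URL-encoded date-math syntax so each run gets a dated name:&lt;/p&gt;

```yaml
apiVersion: batch/v1beta1        # use batch/v1 on newer Kubernetes versions
kind: CronJob
metadata:
  name: es-snapshot
spec:
  schedule: "0 1 * * *"          # every day at 01:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: take-snapshot
            image: curlimages/curl:7.72.0
            # URL-encoded date-math name, e.g. snap-2020.06.13
            args: ["-XPUT", "http://elasticsearch-master:9200/_snapshot/my_s3_repository/%3Csnap-%7Bnow%2Fd%7D%3E"]
```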

&lt;h2&gt;
  
  
  Restoring from a snapshot
&lt;/h2&gt;

&lt;p&gt;The restore API is pretty simple as well. By default, all indices in the snapshot are restored, and the cluster state is not restored. You can make the following curl call for restoring from a snapshot:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/_snapshot/my_backup/snapshot_&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="err"&gt;/_restore&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also provide index level information while restoring from a snapshot:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/_snapshot/my_backup/snapshot_&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="err"&gt;/_restore&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"indices"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"index_1,index_2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ignore_unavailable"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"include_global_state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;              
  &lt;/span&gt;&lt;span class="nl"&gt;"rename_pattern"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"index_(.+)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rename_replacement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"restored_index_$1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"include_aliases"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The restore operation can be performed on a functioning cluster. However, an existing index can only be restored if it’s &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/7.7/indices-open-close.html"&gt;closed&lt;/a&gt; and has the same number of shards as the index in the snapshot.&lt;/p&gt;
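
&lt;p&gt;For example, to restore over an existing index, you would close it first and then run the restore (again assuming a local cluster, and a hypothetical index named index_1 that exists in the snapshot):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl -X POST "http://localhost:9200/index_1/_close"
curl -X POST "http://localhost:9200/_snapshot/my_backup/snapshot_1/_restore" \
  -H 'Content-Type: application/json' \
  -d '{"indices": "index_1"}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;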

&lt;p&gt;That's All Folks!&lt;/p&gt;

&lt;p&gt;Happy Coding! Cheers :)&lt;/p&gt;

</description>
      <category>aws</category>
      <category>elasticsearch</category>
      <category>s3</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>An Introduction to Kubernetes Health Checks - Readiness Probe (Part II)</title>
      <dc:creator>Ambar Mehrotra</dc:creator>
      <pubDate>Sat, 06 Jun 2020 13:59:54 +0000</pubDate>
      <link>https://forem.com/_notanengineer/an-introduction-to-kubernetes-health-checks-readiness-probe-part-ii-4amj</link>
      <guid>https://forem.com/_notanengineer/an-introduction-to-kubernetes-health-checks-readiness-probe-part-ii-4amj</guid>
<description>&lt;p&gt;It's been a long time since I wrote, and this post on Kubernetes Readiness probes has been long overdue. If you haven't read the first part of this series on &lt;a href="https://dev.to/_notanengineer/an-introduction-to-kubernetes-health-checks-liveness-probe-part-i-2elj"&gt;Kubernetes Liveness Probes&lt;/a&gt;, I suggest you check it out. In this post, we will look mainly at the Readiness Probe and how it can be used to monitor the health of your applications.&lt;/p&gt;

&lt;p&gt;As discussed earlier, Kubernetes provides &lt;strong&gt;3 different kinds of health checks&lt;/strong&gt; to monitor the state of your applications:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Liveness Probe&lt;/li&gt;
&lt;li&gt;Readiness Probe&lt;/li&gt;
&lt;li&gt;Startup Probe&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you are working with cloud applications, you might come across scenarios where one or more instances of your application are not ready to serve any requests. In such scenarios, you would rather that traffic not be routed to those instances. Some of these scenarios include but are not limited to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One of your application instances might be performing a batch operation periodically -- like reading a large SQL table and writing the results to S3&lt;/li&gt;
&lt;li&gt;Your application instances might be loading data from a DB to a cache on startup, and you do not want them to serve any traffic until the cache is populated&lt;/li&gt;
&lt;li&gt;You might not want your application to serve any traffic if some of the dependent services are down -- for example, if you have an image processing service that works off of files in Amazon S3, you might want to stop directing any traffic to your image processing service if S3 itself is down.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: In the above scenario, it is advisable to configure your readiness probe so that it can differentiate between the dependent service being unavailable and it merely having latency issues. For example, you would not want your service to stop serving requests if S3 has an increased latency of 100ms.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;In most of the scenarios mentioned above, you don’t want to kill the application, but you don’t want to send it requests either. Kubernetes provides Readiness probes to detect and mitigate these situations&lt;/strong&gt;. Readiness probes can be used by your application to tell Kubernetes that it is not ready to accept any traffic at the moment.&lt;/p&gt;
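
&lt;p&gt;As a rough sketch of the cache-loading scenario above, a readiness endpoint can simply report the application's internal state. The handler and the CACHE_READY flag below are hypothetical, not part of any Kubernetes API; the only contract is the status code, since the kubelet treats 200-399 as ready:&lt;/p&gt;

```python
from http.server import BaseHTTPRequestHandler

CACHE_READY = False  # hypothetical flag, flipped to True once startup loading completes


def readiness_code(ready):
    # Map internal readiness state to an HTTP status for the kubelet:
    # 200 keeps the pod in the Service endpoints, 503 removes it
    return 200 if ready else 503


class ReadinessHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/readiness":
            self.send_response(readiness_code(CACHE_READY))
            self.end_headers()
```

Point the pod's &lt;code&gt;readinessProbe.httpGet&lt;/code&gt; at &lt;code&gt;/readiness&lt;/code&gt; and the pod will only receive traffic once the flag is set.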

&lt;p&gt;&lt;a href="https://i.giphy.com/media/8cuVdoyDlfRnPFYMcv/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/8cuVdoyDlfRnPFYMcv/giphy.gif" alt="Not Ready" width="480" height="270"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;According to the Kubernetes Documentation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The kubelet uses readiness probes to know when a container is ready to start accepting traffic. A Pod is considered ready when all of its containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;What this essentially means is that &lt;strong&gt;when the Readiness probe fails for a particular pod of your application, Kubernetes removes that pod from the service mapping and stops forwarding any traffic to it&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/dWCimzZf4IbSWaPIZA/source.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/dWCimzZf4IbSWaPIZA/source.gif" alt="Readiness Prob" width="1010" height="862"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Anatomy of a Readiness Probe
&lt;/h1&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;readiness&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;readiness-exec&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;readiness&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k8s.gcr.io/busybox&lt;/span&gt;
    &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/bin/sh&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;-c&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep &lt;/span&gt;&lt;span class="m"&gt;600&lt;/span&gt;
    &lt;span class="na"&gt;readinessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;exec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cat&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/tmp/healthy&lt;/span&gt;
      &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
      &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you look at the &lt;code&gt;readinessProbe&lt;/code&gt; section of the yaml, you can see that the kubelet performs a &lt;code&gt;cat&lt;/code&gt; operation on the &lt;code&gt;/tmp/healthy&lt;/code&gt; file. If the file is present and the cat operation is successful, the command returns with exit status 0, and the kubelet considers the container to be available and ready to accept traffic. On the other hand, if the command returns with a non-zero exit status, the kubelet removes the container from the Service/LoadBalancer, and no traffic is forwarded to it until the readiness probe succeeds again.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;initialDelaySeconds&lt;/code&gt; parameter tells the kubelet that it should wait for 5 seconds before performing the first readiness check. This ensures that the container is not considered to be in an unavailable state when it is booting up. After the initial delay, the kubelet performs the readiness check every 5 seconds as defined by the &lt;code&gt;periodSeconds&lt;/code&gt; field.&lt;/p&gt;
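
&lt;p&gt;Once the pod above is applied, you can watch the READY column flip as the probe starts failing after &lt;code&gt;/tmp/healthy&lt;/code&gt; is removed (the filename readiness-exec.yaml is a placeholder for wherever you saved the manifest):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply -f readiness-exec.yaml
kubectl get pod readiness-exec --watch
kubectl describe pod readiness-exec   # shows the readiness probe failure events
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;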

&lt;p&gt;Apart from generic commands, a Readiness probe can also be defined over TCP and HTTP endpoints, which are especially helpful if you are developing web applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  TCP readiness probe
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;goproxy&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;goproxy&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;goproxy&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k8s.gcr.io/goproxy:0.1&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
    &lt;span class="na"&gt;readinessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;tcpSocket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
      &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;15&lt;/span&gt;
      &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This kind of readiness probe is basically a port check. If you want to check if a particular port on your web application is responsive or not, this is the way to go.&lt;/p&gt;
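
&lt;p&gt;Conceptually, the check is no different from probing the port yourself; something like the following netcat call (with a placeholder pod IP) approximates what the kubelet does:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;POD_IP=10.0.0.12   # placeholder; use the actual pod IP
nc -z -w 2 "$POD_IP" 8080; echo $?   # exit status 0 means the port accepted a connection
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;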

&lt;h2&gt;
  
  
  HTTP readiness probe
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;readiness&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;readiness-http&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;readiness&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k8s.gcr.io/readiness&lt;/span&gt;
    &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/server&lt;/span&gt;
    &lt;span class="na"&gt;readinessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/readiness&lt;/span&gt;
        &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
        &lt;span class="na"&gt;httpHeaders&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Custom-Header&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Awesome&lt;/span&gt;
      &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
      &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For an HTTP readiness probe, the kubelet polls the endpoint of the container as defined by the &lt;code&gt;path&lt;/code&gt; and &lt;code&gt;port&lt;/code&gt; parameters in the yaml. If the endpoint returns a success status code, the container is considered healthy.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Any code greater than or equal to 200 and less than 400 indicates success. Any other code indicates failure&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;In this post, we looked at scenarios where you might not want an instance of your application to be available to serve requests, and how the Kubernetes Readiness Probe helps you identify and mitigate such scenarios effectively. Stay healthy and stay tuned.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/8lMQKIZIXiOn0VVs3A/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/8lMQKIZIXiOn0VVs3A/giphy.gif" alt="Gotta stay healthy" width="480" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Happy Coding! Cheers :)&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>docker</category>
      <category>aws</category>
      <category>devops</category>
    </item>
    <item>
      <title>An Introduction to Kubernetes Health Checks - Liveness Probe (Part I)</title>
      <dc:creator>Ambar Mehrotra</dc:creator>
      <pubDate>Sun, 19 Apr 2020 17:43:26 +0000</pubDate>
      <link>https://forem.com/_notanengineer/an-introduction-to-kubernetes-health-checks-liveness-probe-part-i-2elj</link>
      <guid>https://forem.com/_notanengineer/an-introduction-to-kubernetes-health-checks-liveness-probe-part-i-2elj</guid>
      <description>&lt;p&gt;This post was originally published on my blog: &lt;a href="https://ambar.dev/kubernetes-liveness-probe.html"&gt;https://ambar.dev/kubernetes-liveness-probe.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It was not very long ago that we were deploying individual services on each Virtual Machine. This process required the engineer in charge of a deployment to be aware of all the machines where each service ran. Sure, people had built great solutions around this deployment model, like tagging their EC2 machines with special names and using automation tools like Rundeck, Jenkins, etc., to automate the deployment process. Although this approach had matured to a great extent over several years, it still had its shortcomings -- &lt;em&gt;random application crashes, inefficient deployment practices, poor resilience to failures, improper resource utilization, and bad practices around secret and configuration management&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The rise of Docker and Kubernetes
&lt;/h2&gt;

&lt;p&gt;In order to solve the above-mentioned problems, people started building solutions around container environments like Docker and Kubernetes, which not only solved those problems but also provided other benefits. One of the major benefits of using a platform like &lt;strong&gt;Kubernetes&lt;/strong&gt; is that it provides &lt;strong&gt;self-healing&lt;/strong&gt; capabilities to your application. According to the Kubernetes documentation, self-healing can be defined as:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Kubernetes restarts containers that fail, replaces containers, kills containers that don’t respond to your user-defined health check, and doesn’t advertise them to clients until they are ready to serve.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;What this basically means is that if your application for some reason goes into a state where it cannot perform its desired function, Kubernetes will try to replace the crashing instance with a new one until it succeeds. Well, how does Kubernetes know that a pod (&lt;em&gt;a Pod is the basic execution unit of a Kubernetes application&lt;/em&gt;) is not in a healthy state, or whether it is ready to handle any extra workload at the moment? Kubernetes solves this problem with the help of &lt;strong&gt;health checks&lt;/strong&gt;. Kubernetes has 2 types of health checks that it uses to determine the health of a running pod -- the Liveness Probe and the Readiness Probe. In this first part, we will take a look at how the liveness probe works and how we can use it to keep our applications healthy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Liveness Probe
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/SYRBDJ0Pj3pSxx6Lft/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/SYRBDJ0Pj3pSxx6Lft/giphy.gif" alt="Kubernetes Liveness Probe" width="480" height="270"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Developers and engineers often make mistakes. Sometimes, these mistakes don't get caught in our nightly or staging environments and spill over to production. Often, they result in applications that get stuck in tricky situations and hence cannot perform their designated operations as usual. Sometimes, these corner cases can cause the application to crash under the most unexpected circumstances, when it is not possible for an engineer to take a look and correct it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/u5Pxn776rafRe/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/u5Pxn776rafRe/giphy.gif" alt="Unexpected Circumstances" width="433" height="221"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Some of the corner cases might include the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An application not responding because of a deadlock&lt;/li&gt;
&lt;li&gt;Null Pointer Exceptions causing the application to crash&lt;/li&gt;
&lt;li&gt;Out of Memory (OOM) errors causing the application to crash&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Often, applications stuck in these states need a restart to start functioning correctly again&lt;/strong&gt;. The &lt;a href="https://kubernetes.io/docs/admin/kubelet/"&gt;kubelet&lt;/a&gt; uses &lt;strong&gt;liveness probes&lt;/strong&gt; to check whether the application is alive and behaving correctly, and hence to know when to restart a container. Let us look at an example to see what parameters are involved in a liveness probe.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;liveness&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;liveness-exec&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;liveness&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k8s.gcr.io/busybox&lt;/span&gt;
    &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/bin/sh&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;-c&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep &lt;/span&gt;&lt;span class="m"&gt;600&lt;/span&gt;
    &lt;span class="na"&gt;livenessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;exec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cat&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/tmp/healthy&lt;/span&gt;
      &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
      &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you look at the &lt;code&gt;livenessProbe&lt;/code&gt; section of the yaml, you can see that the &lt;em&gt;kubelet&lt;/em&gt; performs a &lt;code&gt;cat&lt;/code&gt; operation on the &lt;code&gt;/tmp/healthy&lt;/code&gt; file. If the file is present and the cat operation is successful, the command returns with &lt;em&gt;exit status 0&lt;/em&gt;, and the kubelet considers the container to be in a healthy state. On the other hand, if the command returns with a &lt;em&gt;non-zero exit status&lt;/em&gt;, the kubelet kills the container and restarts it.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;initialDelaySeconds&lt;/code&gt; parameter tells the &lt;em&gt;kubelet&lt;/em&gt; that it should wait for 5 seconds before performing the first liveness check. This ensures that the container is not considered to be in a crashing state when it is booting up. After the initial delay, the &lt;em&gt;kubelet&lt;/em&gt; performs the liveness check every 5 seconds as defined by the &lt;code&gt;periodSeconds&lt;/code&gt; field.&lt;/p&gt;

&lt;p&gt;When the container starts, it executes the command &lt;code&gt;touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600&lt;/code&gt; that can be divided into the following parts which are performed in the mentioned order:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create the file &lt;code&gt;/tmp/healthy&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Go to sleep for 30s&lt;/li&gt;
&lt;li&gt;Delete the earlier created file &lt;code&gt;/tmp/healthy&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Go to sleep for 600s&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After the file &lt;code&gt;/tmp/healthy&lt;/code&gt; is deleted, the liveness probe will start failing and returning a non-zero exit code to the &lt;em&gt;kubelet&lt;/em&gt;. On detecting the failure, the &lt;em&gt;kubelet&lt;/em&gt; will kill the existing container and replace it with a new one. The &lt;em&gt;kubelet&lt;/em&gt; will keep doing this until the liveness probe succeeds. You can run the command &lt;code&gt;kubectl describe po liveness-exec&lt;/code&gt; to view the pod events.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--eivH6zB0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://imgur.com/PmEXLS0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--eivH6zB0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://imgur.com/PmEXLS0.png" alt="Liveness Probe Pod Status" width="880" height="174"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see, when the &lt;em&gt;kubelet&lt;/em&gt; found the pod to be unhealthy 3 consecutive times over a period of 14 seconds, it marked the pod as &lt;strong&gt;unhealthy&lt;/strong&gt; and went ahead to restart it. Apart from generic commands, a Liveness probe can also be defined over &lt;code&gt;TCP&lt;/code&gt; and &lt;code&gt;HTTP&lt;/code&gt; endpoints which are especially helpful if you are developing web applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  TCP liveness probe
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;goproxy&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;goproxy&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;goproxy&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k8s.gcr.io/goproxy:0.1&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
    &lt;span class="na"&gt;livenessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;tcpSocket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
      &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;15&lt;/span&gt;
      &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This kind of liveness probe is basically a port check. If you want to check if a particular port on your web application is responsive or not, this is the way to go.&lt;/p&gt;

&lt;h3&gt;
  
  
  HTTP liveness probe
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;liveness&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;liveness-http&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;liveness&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k8s.gcr.io/liveness&lt;/span&gt;
    &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/server&lt;/span&gt;
    &lt;span class="na"&gt;livenessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/healthz&lt;/span&gt;
        &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
        &lt;span class="na"&gt;httpHeaders&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Custom-Header&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Awesome&lt;/span&gt;
      &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
      &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For an HTTP liveness probe, kubelet polls the endpoint of the container as defined by the &lt;code&gt;path&lt;/code&gt; and &lt;code&gt;port&lt;/code&gt; parameters in the yaml. If the endpoint returns a success status code, the container is considered healthy.&lt;/p&gt;
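
&lt;p&gt;The kubelet's HTTP check is essentially a GET request with the configured headers; you can approximate it with curl from inside the cluster (the pod IP below is a placeholder):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;POD_IP=10.0.0.12   # placeholder; use the actual pod IP
curl -s -o /dev/null -w "%{http_code}" -H "Custom-Header: Awesome" "http://$POD_IP:8080/healthz"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;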

&lt;blockquote&gt;
&lt;p&gt;Any code greater than or equal to 200 and less than 400 indicates success. Any other code indicates failure&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this post, we saw the problems with the traditional approach to deploying and monitoring applications, the solutions that Docker and Kubernetes provide for handling those issues, and how the Liveness Probe helps resolve them. In the next post, we will take a look at the other kind of Kubernetes health check -- the Readiness Probe. Stay healthy and stay tuned.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/8lMQKIZIXiOn0VVs3A/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/8lMQKIZIXiOn0VVs3A/giphy.gif" alt="Healthy" width="480" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Happy Coding! Cheers :)&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>docker</category>
      <category>devops</category>
      <category>aws</category>
    </item>
    <item>
      <title>Installing Elasticsearch inside a Kubernetes cluster with Helm and Terraform</title>
      <dc:creator>Ambar Mehrotra</dc:creator>
      <pubDate>Fri, 03 Apr 2020 14:13:36 +0000</pubDate>
      <link>https://forem.com/_notanengineer/installing-elasticsearch-inside-a-kubernetes-cluster-with-helm-and-terraform-40jf</link>
      <guid>https://forem.com/_notanengineer/installing-elasticsearch-inside-a-kubernetes-cluster-with-helm-and-terraform-40jf</guid>
      <description>&lt;p&gt;This post was originally published on my blog: &lt;a href="https://ambar.dev/tf-helm-kubernetes-elasticsearch-setup.html"&gt;Installing Elasticsearch inside a Kubernetes cluster with Helm and Terraform&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Github Repository: &lt;a href="https://github.com/coder006/tf-helm-kubernetes-elasticsearch.git"&gt;tf-helm-kubernetes-elasticsearch&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note&lt;/em&gt;&lt;/strong&gt;:&lt;br&gt;
This guide uses Terraform for making API calls and state management. If you have helm installed on your machine, you can use that instead for installing the chart.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What is Elasticsearch?
&lt;/h2&gt;

&lt;p&gt;According to the Elasticsearch website:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Elasticsearch is a distributed, open source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Elasticsearch is generally used as the underlying engine for platforms that perform complex text search, logging, or real-time advanced analytics operations. The ELK stack (Elasticsearch, Logstash, and Kibana) has also become the de facto standard for logging and its visualization in container environments.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;Before we move forward, let us take a look at the basic architecture of Elasticsearch:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--pkQ3ztaH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/images/za-2-az.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--pkQ3ztaH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/images/za-2-az.png" alt="Elasticsearch Nodes" title="Elasticsearch Cluster" width="880" height="494"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The above is an overview of a basic &lt;strong&gt;Elasticsearch Cluster&lt;/strong&gt;. A &lt;strong&gt;node&lt;/strong&gt; is a server (physical or virtual) that stores part of the data and participates in the cluster's indexing and search operations. A &lt;strong&gt;cluster&lt;/strong&gt; is a collection of one or more such nodes that together hold all of the data. Each node in turn can hold multiple shards from one or more indices. The kinds of nodes available in Elasticsearch are &lt;em&gt;Master-eligible node&lt;/em&gt;, &lt;em&gt;Data node&lt;/em&gt;, &lt;em&gt;Ingest node&lt;/em&gt;, and &lt;em&gt;Machine learning node&lt;/em&gt; (not available in the OSS version). In this article, we will only be looking at the master and data nodes for the sake of simplicity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Master-eligible node
&lt;/h3&gt;

&lt;p&gt;A node that has the &lt;code&gt;node.master&lt;/code&gt; flag set to &lt;code&gt;true&lt;/code&gt;, which makes it eligible to be elected as the &lt;em&gt;master node&lt;/em&gt; that controls the cluster. One of the &lt;em&gt;master-eligible&lt;/em&gt; nodes is elected as the &lt;strong&gt;Master&lt;/strong&gt; via the &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery.html"&gt;master election process&lt;/a&gt;. A few of the functions performed by the &lt;em&gt;master node&lt;/em&gt; are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Creating or deleting an index&lt;/li&gt;
&lt;li&gt;Tracking which nodes are part of the cluster&lt;/li&gt;
&lt;li&gt;Deciding which shards to allocate to which nodes&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Data node
&lt;/h3&gt;

&lt;p&gt;A node that has the &lt;code&gt;node.data&lt;/code&gt; flag set to &lt;code&gt;true&lt;/code&gt;. Data nodes hold the shards that contain the documents you have indexed, and they perform operations that are I/O-, memory-, and CPU-intensive in nature. Some of the functions performed by &lt;em&gt;data nodes&lt;/em&gt; are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data related operations like CRUD&lt;/li&gt;
&lt;li&gt;Search&lt;/li&gt;
&lt;li&gt;Aggregations&lt;/li&gt;
&lt;/ul&gt;
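&lt;p&gt;Outside of Helm, these roles are set per node in &lt;code&gt;elasticsearch.yml&lt;/code&gt;; a sketch for a dedicated master-eligible node and a dedicated data node:&lt;/p&gt;

```yaml
# elasticsearch.yml for a dedicated master-eligible node
node.master: true
node.data: false
```

```yaml
# elasticsearch.yml for a dedicated data node
node.master: false
node.data: true
```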




&lt;h2&gt;
  
  
  Terminology
&lt;/h2&gt;

&lt;p&gt;Now that we have a basic idea of the Elasticsearch architecture, let us see how to install Elasticsearch inside a Kubernetes cluster using Helm and Terraform. Before moving forward, let us go through some basic terminology.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kubernetes&lt;/strong&gt;: Kubernetes is a portable, extensible, open-source platform for managing containerized workloads and services that facilitates both declarative configuration and automation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Helm&lt;/strong&gt;: Helm is an application package manager running atop Kubernetes. It allows describing the application structure through convenient helm charts and managing it with simple commands.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Terraform&lt;/strong&gt;: Terraform is a tool for building, changing, and versioning infrastructure safely and efficiently. Terraform can manage existing and popular service providers as well as custom in-house solutions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;p&gt;First, let us describe the variables and the default values needed for setting up the Elasticsearch Cluster:&lt;/p&gt;

&lt;h3&gt;
  
  
  Default Values:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="n"&gt;master_node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="n"&gt;volume_size&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;number&lt;/span&gt;
      &lt;span class="n"&gt;cpu&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;number&lt;/span&gt;
      &lt;span class="n"&gt;memory&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;number&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;data_node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="n"&gt;volume_size&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;number&lt;/span&gt;
      &lt;span class="n"&gt;cpu&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;number&lt;/span&gt;
      &lt;span class="n"&gt;memory&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;number&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;master_node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;volume_size&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
      &lt;span class="n"&gt;cpu&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
      &lt;span class="n"&gt;memory&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;data_node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;volume_size&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
      &lt;span class="n"&gt;cpu&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
      &lt;span class="n"&gt;memory&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="s"&gt;"kubeconfig_file_path"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;type&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
  &lt;span class="k"&gt;default&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/my/file/path"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
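&lt;p&gt;These defaults can be overridden per environment without touching the variable definitions, for example through a &lt;code&gt;terraform.tfvars&lt;/code&gt; file (the values here are purely illustrative):&lt;/p&gt;

```hcl
# terraform.tfvars -- illustrative overrides for the variables above
elasticsearch = {
  master_node = { volume_size = 50, cpu = 2, memory = 4 }
  data_node   = { volume_size = 100, cpu = 4, memory = 8 }
}

kubeconfig_file_path = "/home/me/.kube/config"
```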



&lt;blockquote&gt;
&lt;p&gt;For the sake of simplicity, I will assume that you have a working helm installation. You can also head over to the &lt;a href="https://github.com/coder006/tf-helm-kubernetes-elasticsearch.git"&gt;Github Repository&lt;/a&gt; to see how to install helm and tiller onto your Kubernetes cluster using Terraform.&lt;/p&gt;
&lt;/blockquote&gt;
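&lt;p&gt;For reference, if you do have a local helm installation, the same chart can be installed directly from the CLI. This is only a sketch, using Helm 2 syntax to match the tiller-based provider in this guide; the release and cluster names mirror the ones used here:&lt;/p&gt;

```shell
# Add the Elastic chart repository and install the chart (Helm 2 syntax)
helm repo add elastic https://helm.elastic.co
helm repo update

helm install --name elasticsearch-master elastic/elasticsearch \
  --version 7.6.1 \
  --set clusterName=elasticsearch-cluster \
  --set nodeGroup=master
```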

&lt;h3&gt;
  
  
  Terraform Helm Setup
&lt;/h3&gt;

&lt;p&gt;This step involves declaring a helm provider and the Elastic helm repository to pull the chart from:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="s"&gt;"helm"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;kubernetes&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;config_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kubeconfig_file_path&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"~&amp;gt; 0.10.4"&lt;/span&gt;
  &lt;span class="n"&gt;service_account&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kubernetes_service_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tiller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;
  &lt;span class="n"&gt;install_tiller&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="s"&gt;"helm_repository"&lt;/span&gt; &lt;span class="s"&gt;"stable"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"elastic"&lt;/span&gt;
  &lt;span class="n"&gt;url&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"https://helm.elastic.co"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Setting up Master Eligible and Data nodes
&lt;/h3&gt;

&lt;p&gt;Let us take a look at some of the important fields used in the following helm release resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;clusterName&lt;/code&gt; - The name of the elasticsearch cluster, with a default value of &lt;code&gt;elasticsearch&lt;/code&gt;. Because elasticsearch uses the cluster name to decide which cluster a new node joins, it is better to set this field explicitly to a unique value.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;nodeGroup&lt;/code&gt; - This tells the elasticsearch helm chart whether the node is a master eligible node or a data node&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;storageClassName&lt;/code&gt; - The kubernetes storage class that you want to use for provisioning the attached volumes. You can skip this field if your cloud provider has a default storageclass object defined&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;cpu&lt;/code&gt; - The number of CPU cores you want to give to the elasticsearch pod&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;memory&lt;/code&gt; - The amount of memory you want to allocate to the elasticsearch pod&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Master Eligible Nodes
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="n"&gt;helm_release&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch_master"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch-master"&lt;/span&gt;
  &lt;span class="n"&gt;repository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;helm_repository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stable&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;
  &lt;span class="n"&gt;chart&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch"&lt;/span&gt;
  &lt;span class="n"&gt;version&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"7.6.1"&lt;/span&gt;
  &lt;span class="n"&gt;timeout&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;900&lt;/span&gt;

  &lt;span class="n"&gt;values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;RAW_VALUES&lt;/span&gt;
&lt;span class="nl"&gt;volumeClaimTemplate:&lt;/span&gt;
  &lt;span class="nl"&gt;accessModes:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="s"&gt;"ReadWriteOnce"&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nl"&gt;storageClassName:&lt;/span&gt; &lt;span class="s"&gt;"my-storage-class"&lt;/span&gt;
  &lt;span class="nl"&gt;resources:&lt;/span&gt;
    &lt;span class="nl"&gt;requests:&lt;/span&gt;
      &lt;span class="nl"&gt;storage:&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elasticsearch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;master_node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;volume_size&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="n"&gt;Gi&lt;/span&gt;
&lt;span class="nl"&gt;resources:&lt;/span&gt;
  &lt;span class="nl"&gt;requests:&lt;/span&gt;
    &lt;span class="nl"&gt;cpu:&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elasticsearch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;master_node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpu&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nl"&gt;memory:&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elasticsearch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data_node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="n"&gt;Gi&lt;/span&gt;
&lt;span class="nl"&gt;roles:&lt;/span&gt;
  &lt;span class="nl"&gt;master:&lt;/span&gt; &lt;span class="s"&gt;"true"&lt;/span&gt;
  &lt;span class="nl"&gt;ingest:&lt;/span&gt; &lt;span class="s"&gt;"false"&lt;/span&gt;
  &lt;span class="nl"&gt;data:&lt;/span&gt; &lt;span class="s"&gt;"false"&lt;/span&gt;
&lt;span class="n"&gt;RAW_VALUES&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;

  &lt;span class="n"&gt;set&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"imageTag"&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"7.6.2"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;set&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"clusterName"&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch-cluster"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;set&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"nodeGroup"&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"master"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Data Nodes
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="n"&gt;helm_release&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch_data"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch-data"&lt;/span&gt;
  &lt;span class="n"&gt;repository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;helm_repository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stable&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;
  &lt;span class="n"&gt;chart&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch"&lt;/span&gt;
  &lt;span class="n"&gt;version&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"7.6.1"&lt;/span&gt;
  &lt;span class="n"&gt;timeout&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;900&lt;/span&gt;

  &lt;span class="n"&gt;values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;RAW_VALUES&lt;/span&gt;
&lt;span class="nl"&gt;volumeClaimTemplate:&lt;/span&gt;
  &lt;span class="nl"&gt;accessModes:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="s"&gt;"ReadWriteOnce"&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nl"&gt;storageClassName:&lt;/span&gt; &lt;span class="s"&gt;"my-storage-class"&lt;/span&gt;
  &lt;span class="nl"&gt;resources:&lt;/span&gt;
    &lt;span class="nl"&gt;requests:&lt;/span&gt;
      &lt;span class="nl"&gt;storage:&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elasticsearch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data_node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;volume_size&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="n"&gt;Gi&lt;/span&gt;
&lt;span class="nl"&gt;resources:&lt;/span&gt;
  &lt;span class="nl"&gt;requests:&lt;/span&gt;
    &lt;span class="nl"&gt;cpu:&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elasticsearch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data_node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpu&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nl"&gt;memory:&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elasticsearch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data_node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="n"&gt;Gi&lt;/span&gt;
&lt;span class="nl"&gt;roles:&lt;/span&gt;
  &lt;span class="nl"&gt;master:&lt;/span&gt; &lt;span class="s"&gt;"false"&lt;/span&gt;
  &lt;span class="nl"&gt;ingest:&lt;/span&gt; &lt;span class="s"&gt;"true"&lt;/span&gt;
  &lt;span class="nl"&gt;data:&lt;/span&gt; &lt;span class="s"&gt;"true"&lt;/span&gt;
&lt;span class="n"&gt;RAW_VALUES&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;

  &lt;span class="n"&gt;set&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"imageTag"&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"7.6.2"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;set&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"clusterName"&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch-cluster"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;set&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"nodeGroup"&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"data"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Happy Coding! Cheers :)&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>terraform</category>
      <category>elasticsearch</category>
      <category>devops</category>
    </item>
    <item>
      <title>Setting up a VPN connection between AWS and Alicloud using Terraform</title>
      <dc:creator>Ambar Mehrotra</dc:creator>
      <pubDate>Sat, 28 Mar 2020 18:15:06 +0000</pubDate>
      <link>https://forem.com/_notanengineer/setting-up-a-vpn-connection-between-aws-and-alicloud-using-terraform-1pgf</link>
      <guid>https://forem.com/_notanengineer/setting-up-a-vpn-connection-between-aws-and-alicloud-using-terraform-1pgf</guid>
      <description>&lt;p&gt;Github Repository: &lt;a href="https://github.com/coder006/tf-aws-alicloud-vpn.git"&gt;tf-aws-alicloud-vpn&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note&lt;/em&gt;&lt;/strong&gt;:&lt;br&gt;
This is not a guide on the internals of a Virtual Private Network. Rather, this post outlines how to set up a VPN connection between AWS and Alicloud. This guide uses Terraform for making API calls and state management; you can choose to use any HTTP client, or the aws and alicloud CLIs, to make the same API calls and end up with a working VPN connection.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Problem Statement
&lt;/h2&gt;

&lt;p&gt;When you are working in a multicloud environment, many scenarios involve establishing a communication channel between services and resources that span cloud providers. For example, you might have a common &lt;strong&gt;Rundeck&lt;/strong&gt; machine that deploys build binaries onto virtual machines residing in AWS as well as Azure. Another example might be a script in your CI/CD platform that periodically interacts with resources across cloud providers, like &lt;strong&gt;RDS&lt;/strong&gt;, &lt;strong&gt;Mongo&lt;/strong&gt;, and &lt;strong&gt;RabbitMQ&lt;/strong&gt;, to monitor or update ACL policies.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--K3bfEON8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://www.simform.com/wp-content/uploads/2017/11/Blog-Diagram1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--K3bfEON8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://www.simform.com/wp-content/uploads/2017/11/Blog-Diagram1.png" alt="Multi Cloud Architecture" title="Multi Cloud Architecture" width="880" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Creating a VPN connection helps you securely access resources on one cloud provider from another over an encrypted connection. A VPN connection avoids the hassle of exposing a public endpoint for each resource and then securing it. You can simply whitelist a CIDR block across the VPCs, and all traffic in that CIDR range will be routed over the secure, encrypted connection.&lt;/p&gt;

&lt;h2&gt;
  
  
  VPN Setup
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--LgG9btNz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i.imgur.com/x3i3MC7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--LgG9btNz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i.imgur.com/x3i3MC7.png" alt="Aws Alicloud VPN Architecture" width="600" height="331"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Setting up a VPN connection mainly involves setting up the following components in both AWS and Alicloud:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;VPN Gateway&lt;/li&gt;
&lt;li&gt;Customer Gateway&lt;/li&gt;
&lt;li&gt;VPN Connection&lt;/li&gt;
&lt;li&gt;Connection Route&lt;/li&gt;
&lt;/ul&gt;
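&lt;p&gt;On the AWS side, these four components correspond to standard Terraform resources. The following is only a rough sketch of that mapping; the &lt;code&gt;ip_address&lt;/code&gt; value is a placeholder for the public IP of the Alicloud VPN gateway:&lt;/p&gt;

```hcl
# Rough sketch: AWS-side VPN components. "1.2.3.4" stands in for
# the public IP of the Alicloud VPN gateway.
resource "aws_vpn_gateway" "to_alicloud" {
  vpc_id = var.aws_vpc.vpc_id
}

resource "aws_customer_gateway" "alicloud" {
  bgp_asn    = 65000        # any private ASN works for a static-route VPN
  ip_address = "1.2.3.4"
  type       = "ipsec.1"
}

resource "aws_vpn_connection" "to_alicloud" {
  vpn_gateway_id      = aws_vpn_gateway.to_alicloud.id
  customer_gateway_id = aws_customer_gateway.alicloud.id
  type                = "ipsec.1"
  static_routes_only  = true
}

resource "aws_vpn_connection_route" "alicloud_cidr" {
  destination_cidr_block = var.alicloud_vpc.cidr
  vpn_connection_id      = aws_vpn_connection.to_alicloud.id
}
```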

&lt;p&gt;First and foremost, here are the cluster-specific variables that we will need for AWS and Alicloud:&lt;/p&gt;

&lt;h3&gt;
  
  
  Variables and Cluster Definition
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="cp"&gt;# Default region: Singapore
&lt;/span&gt;&lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="s"&gt;"aws_vpc"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="n"&gt;region&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;profile&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;vpc_id&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;cidr&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;subnet_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;region&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ap-southeast-1"&lt;/span&gt;
    &lt;span class="n"&gt;profile&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"aws-profile"&lt;/span&gt;
    &lt;span class="n"&gt;vpc_id&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"123456789"&lt;/span&gt;
    &lt;span class="n"&gt;cidr&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"172.10.0.0/16"&lt;/span&gt;
    &lt;span class="n"&gt;subnet_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"subnet-123"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="cp"&gt;# Default region: Singapore
# vswitch: AWS subnet equivalent in Alicloud
&lt;/span&gt;&lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpc"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="n"&gt;region&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;profile&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;vpc_id&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;cidr&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;vswitch_id&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;region&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ap-southeast-1"&lt;/span&gt;
    &lt;span class="n"&gt;profile&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"alicloud-profile"&lt;/span&gt;
    &lt;span class="n"&gt;vpc_id&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"987654321"&lt;/span&gt;
    &lt;span class="n"&gt;cidr&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"172.20.0.0/16"&lt;/span&gt;
    &lt;span class="n"&gt;vswitch_id&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"vswitch-123"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Terraform Providers for AWS and Alicloud
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="s"&gt;"aws"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;region&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;
  &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"~&amp;gt; 2.45.0"&lt;/span&gt;
  &lt;span class="n"&gt;profile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;profile&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="s"&gt;"alicloud"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;region&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;
  &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"1.71.1"&lt;/span&gt;
  &lt;span class="n"&gt;profile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;profile&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first step is creating VPN Gateways in both Alicloud and AWS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_gateway"&lt;/span&gt; &lt;span class="s"&gt;"aws_vpn_gateway"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;                 &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"AWS-VPN-Gateway"&lt;/span&gt;
  &lt;span class="n"&gt;vpc_id&lt;/span&gt;               &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vpc_id&lt;/span&gt;
  &lt;span class="n"&gt;bandwidth&lt;/span&gt;            &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"10"&lt;/span&gt;
  &lt;span class="n"&gt;enable_ssl&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;
  &lt;span class="n"&gt;instance_charge_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"PostPaid"&lt;/span&gt;
  &lt;span class="n"&gt;description&lt;/span&gt;          &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"AWS-VPN-Gateway"&lt;/span&gt;
  &lt;span class="n"&gt;vswitch_id&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vswitch_id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"aws_vpn_gateway"&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_gateway"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;vpc_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vpc_id&lt;/span&gt;

  &lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Alicloud-VPN-GW"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
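&lt;p&gt;To see the gateway's public IP address after &lt;code&gt;terraform apply&lt;/code&gt;, an output along these lines can be added (a sketch; the Alicloud gateway exposes its address via the &lt;code&gt;internet_ip&lt;/code&gt; attribute, while on the AWS side the public addresses belong to the tunnels of the VPN Connection created later):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;output "alicloud_vpn_gateway_ip" {
  description = "Public IP of the Alicloud VPN Gateway"
  value       = alicloud_vpn_gateway.aws_vpn_gateway.internet_ip
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;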



&lt;h2&gt;
  
  
  VPN Setup in AWS
&lt;/h2&gt;

&lt;p&gt;Creating a VPN Gateway gives us a publicly accessible IP address for that gateway. In the first step, we will use the IP address of the Alicloud VPN Gateway to set up the AWS side of things. Later on, we will repeat the same process for Alicloud as well.&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS Customer Gateway
&lt;/h3&gt;

&lt;p&gt;According to AWS:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A customer gateway is a resource in AWS that provides information to AWS about your Customer Gateway Device&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A Customer Gateway essentially tells AWS the remote endpoint to which traffic should be forwarded when the destination IP belongs to the Alicloud CIDR range.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"aws_customer_gateway"&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_gw"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;bgp_asn&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;65000&lt;/span&gt;
  &lt;span class="n"&gt;ip_address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;internet_ip&lt;/span&gt;
  &lt;span class="n"&gt;type&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ipsec.1"&lt;/span&gt;

  &lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"alicloud-customer-gateway"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  VPN Connection
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/njYrp176NQsHS/giphy-downsized-large.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/njYrp176NQsHS/giphy-downsized-large.gif" alt="You Shall Not Pass" width="480" height="200"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A VPN Connection resource in AWS creates 2 &lt;em&gt;Tunnels&lt;/em&gt; between your VPC and the remote network (the Alicloud network, represented by &lt;code&gt;customer_gateway_id&lt;/code&gt; in this case). The two tunnels exist for redundancy: if one tunnel goes down, traffic is automatically routed through the other.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"aws_vpn_connection"&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_connection"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;vpn_gateway_id&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;customer_gateway_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_customer_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_gw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;type&lt;/span&gt;                &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ipsec.1"&lt;/span&gt;
  &lt;span class="n"&gt;static_routes_only&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
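&lt;p&gt;The connection resource exports the attributes the Alicloud side will need later, namely the public address and preshared key of each tunnel. A sketch of outputs to surface them (the preshared keys are marked sensitive so they are not printed in plain text):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;output "tunnel_addresses" {
  value = [
    aws_vpn_connection.alicloud_vpn_connection.tunnel1_address,
    aws_vpn_connection.alicloud_vpn_connection.tunnel2_address,
  ]
}

output "tunnel_preshared_keys" {
  value = [
    aws_vpn_connection.alicloud_vpn_connection.tunnel1_preshared_key,
    aws_vpn_connection.alicloud_vpn_connection.tunnel2_preshared_key,
  ]
  sensitive = true
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;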



&lt;h3&gt;
  
  
  VPN Connection Route Entry
&lt;/h3&gt;

&lt;p&gt;This entry tells the VPN connection created in the previous step about the CIDR range of the destination network.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"aws_vpn_connection_route"&lt;/span&gt; &lt;span class="s"&gt;"alicloud"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;destination_cidr_block&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;
  &lt;span class="n"&gt;vpn_connection_id&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  AWS Route Table Modification
&lt;/h3&gt;

&lt;p&gt;Next, we fetch the route table of the private subnet and add a route that tells AWS to forward all traffic belonging to the destination CIDR range to the VPN Gateway we created above.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="s"&gt;"aws_route_table"&lt;/span&gt; &lt;span class="s"&gt;"aws_private_subnet_rt"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;subnet_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subnet_id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"aws_route"&lt;/span&gt; &lt;span class="s"&gt;"r"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;route_table_id&lt;/span&gt;            &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_route_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_private_subnet_rt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;destination_cidr_block&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;
  &lt;span class="n"&gt;gateway_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the AWS setup is done, we repeat the same steps for Alicloud. I will not explain the terminology again, as the concepts are more or less the same.&lt;/p&gt;

&lt;h2&gt;
  
  
  VPN Setup in Alicloud
&lt;/h2&gt;

&lt;p&gt;First of all, we will create 2 customer gateways in Alicloud - one for each of the &lt;em&gt;Tunnels&lt;/em&gt; created by the &lt;em&gt;VPN Connection&lt;/em&gt; in AWS. The &lt;code&gt;ip_address&lt;/code&gt; parameter of each customer gateway holds the public IP address of the corresponding tunnel.&lt;/p&gt;

&lt;h3&gt;
  
  
  Customer Gateway
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_customer_gateway"&lt;/span&gt; &lt;span class="s"&gt;"aws_customer_gateway_1"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"AWSCustomerGateway1"&lt;/span&gt;
  &lt;span class="n"&gt;ip_address&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tunnel1_address&lt;/span&gt;
  &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"AWSCustomerGateway1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_customer_gateway"&lt;/span&gt; &lt;span class="s"&gt;"aws_customer_gateway_2"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"AWSCustomerGateway2"&lt;/span&gt;
  &lt;span class="n"&gt;ip_address&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tunnel2_address&lt;/span&gt;
  &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"AWSCustomerGateway2"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  VPN Connection
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="cp"&gt;# The `effect_immediately` parameter determines whether to delete a successfully negotiated IPsec tunnel and initiate a negotiation again
&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_connection"&lt;/span&gt; &lt;span class="s"&gt;"ipsec_connection_1"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;                &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"IPSecConnection1"&lt;/span&gt;
  &lt;span class="n"&gt;vpn_gateway_id&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;customer_gateway_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_customer_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_customer_gateway_1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;local_subnet&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="n"&gt;remote_subnet&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="n"&gt;effect_immediately&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;
  &lt;span class="n"&gt;ike_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ike_auth_alg&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"sha1"&lt;/span&gt;
    &lt;span class="n"&gt;ike_enc_alg&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"aes"&lt;/span&gt;
    &lt;span class="n"&gt;ike_version&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ikev1"&lt;/span&gt;
    &lt;span class="n"&gt;ike_mode&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"main"&lt;/span&gt;
    &lt;span class="n"&gt;ike_lifetime&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;
    &lt;span class="n"&gt;psk&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tunnel1_preshared_key&lt;/span&gt;
    &lt;span class="n"&gt;ike_pfs&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"group2"&lt;/span&gt;
    &lt;span class="n"&gt;ike_local_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;internet_ip&lt;/span&gt;
    &lt;span class="n"&gt;ike_remote_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tunnel1_address&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;ipsec_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_pfs&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"group2"&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_enc_alg&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"aes"&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_auth_alg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"sha1"&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_lifetime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_connection"&lt;/span&gt; &lt;span class="s"&gt;"ipsec_connection_2"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;                &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"IPSecConnection2"&lt;/span&gt;
  &lt;span class="n"&gt;vpn_gateway_id&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;customer_gateway_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_customer_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_customer_gateway_2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;local_subnet&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="n"&gt;remote_subnet&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="n"&gt;effect_immediately&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;
  &lt;span class="n"&gt;ike_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ike_auth_alg&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"sha1"&lt;/span&gt;
    &lt;span class="n"&gt;ike_enc_alg&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"aes"&lt;/span&gt;
    &lt;span class="n"&gt;ike_version&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ikev1"&lt;/span&gt;
    &lt;span class="n"&gt;ike_mode&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"main"&lt;/span&gt;
    &lt;span class="n"&gt;ike_lifetime&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;
    &lt;span class="n"&gt;psk&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tunnel2_preshared_key&lt;/span&gt;
    &lt;span class="n"&gt;ike_pfs&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"group2"&lt;/span&gt;
    &lt;span class="n"&gt;ike_local_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;internet_ip&lt;/span&gt;
    &lt;span class="n"&gt;ike_remote_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tunnel2_address&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;ipsec_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_pfs&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"group2"&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_enc_alg&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"aes"&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_auth_alg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"sha1"&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_lifetime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Although only a few of the above parameters are mandatory, I have included the exhaustive list to give you an idea of what parameters are available.&lt;/p&gt;

&lt;h3&gt;
  
  
  VPN Connection Route Entry
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_route_entry"&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_route_entry_1"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;vpn_gateway_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;route_dest&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;
  &lt;span class="n"&gt;next_hop&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ipsec_connection_1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;weight&lt;/span&gt;         &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="n"&gt;publish_vpc&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_route_entry"&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_route_entry_2"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;vpn_gateway_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;route_dest&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;
  &lt;span class="n"&gt;next_hop&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ipsec_connection_2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;weight&lt;/span&gt;         &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
  &lt;span class="n"&gt;publish_vpc&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The entire code can be found in the &lt;a href="https://github.com/coder006/tf-aws-alicloud-vpn.git"&gt;tf-aws-alicloud-vpn&lt;/a&gt; repository on GitHub.&lt;/p&gt;

&lt;p&gt;Happy Coding! Cheers :)&lt;/p&gt;

</description>
      <category>aws</category>
      <category>devops</category>
      <category>terraform</category>
      <category>vpn</category>
    </item>
  </channel>
</rss>
