<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Ambar Mehrotra</title>
    <description>The latest articles on Forem by Ambar Mehrotra (@_notanengineer).</description>
    <link>https://forem.com/_notanengineer</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F163226%2Fbe7a4859-f0e2-4ea7-9e64-562539b6b252.jpeg</url>
      <title>Forem: Ambar Mehrotra</title>
      <link>https://forem.com/_notanengineer</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/_notanengineer"/>
    <language>en</language>
    <item>
      <title>Disaster Recovery - A practical guide (Part 1)</title>
      <dc:creator>Ambar Mehrotra</dc:creator>
      <pubDate>Mon, 17 Jan 2022 19:09:57 +0000</pubDate>
      <link>https://forem.com/_notanengineer/disaster-recovery-a-practical-guide-part-1-j6o</link>
      <guid>https://forem.com/_notanengineer/disaster-recovery-a-practical-guide-part-1-j6o</guid>
      <description>&lt;h1&gt;
  
  
  What is Disaster Recovery anyway?
&lt;/h1&gt;

&lt;p&gt;While dealing with terabytes of data every day, it is not uncommon for critical infrastructure components to run into situations that cause data corruption and are not easy to recover from. Many scenarios can force an application into an inconsistent state. These include, but are not limited to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Natural disasters like hurricanes or earthquakes leading to the entire data centre going down&lt;/li&gt;
&lt;li&gt;A bug in the application code leading to incorrect or corrupted data&lt;/li&gt;
&lt;li&gt;Infrastructure failure due to power outages&lt;/li&gt;
&lt;li&gt;Cyber attacks leading to loss of data or partial data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These scenarios are commonly referred to as disasters, and the &lt;strong&gt;ability to recover from these disasters to a consistent state is called disaster recovery&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rKoQ35pK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642442438429/RHcZa2hcu.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rKoQ35pK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642442438429/RHcZa2hcu.gif" alt="chaos-office.gif" width="498" height="280"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this post, I will mostly talk about how we built disaster-recovery strategies for some common data systems like Aurora (MySQL), MongoDB, and Elasticsearch. I will also talk about the challenges we faced, some common pitfalls, and the practical lessons we took away from this project.&lt;/p&gt;

&lt;h1&gt;
  
  
  RTO/RPO
&lt;/h1&gt;

&lt;p&gt;Two terms come up constantly in any discussion of disaster recovery: RTO and RPO. Although they sound fancy, they are intuitive and easy to understand if you look at the problem in a practical way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RPO&lt;/strong&gt; (Recovery Point Objective) - The maximum amount of data loss you are willing to tolerate in case of a disaster, expressed as a span of time. For example, if you can afford to lose one day of data, your RPO is 24 hours, and that is the maximum interval at which you should take backups. Although there are many solutions for taking regular backups, taking them very frequently can drive up costs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RTO&lt;/strong&gt; (Recovery Time Objective) - The maximum amount of time you are willing to spend recovering from a disaster. For example, if it takes me an hour to restore all the lost data, my RTO is 1 hour. Generally speaking, the more data you have, the longer it will take to restore.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ill5ULTb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1640627616097/ApFWLBH2a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ill5ULTb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1640627616097/ApFWLBH2a.png" alt="DisasterRecovery.png" width="621" height="241"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Defining a strategy
&lt;/h1&gt;

&lt;p&gt;Most DR strategies for databases can be divided into 3 major steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Snapshot Creation&lt;/strong&gt;      - Refers to the ability to take snapshots at regular intervals&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Snapshot Retention&lt;/strong&gt;    - Refers to retaining snapshots in a particular window, while deleting everything else. The retention policy windows can generally be divided into &lt;strong&gt;Incremental&lt;/strong&gt; and &lt;strong&gt;Moving&lt;/strong&gt; windows. Examples for each can be found below:

&lt;ul&gt;
&lt;li&gt;Incremental Window
&lt;ul&gt;
&lt;li&gt;Retain one snapshot for every month&lt;/li&gt;
&lt;li&gt;Retain one snapshot for every year&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Moving Window
&lt;ul&gt;
&lt;li&gt;Retain one snapshot for each day for the last 15 days&lt;/li&gt;
&lt;li&gt;Retain one snapshot for each week for the last 4 weeks&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Snapshot Restoration&lt;/strong&gt; - The ability to restore the database to a specific snapshot. Unlike snapshot creation and retention, restoration should not be automated; it should be a manual step. This means that &lt;strong&gt;any kind of data restoration should originate from a clear user intent for such an activity&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's have a look at what we should take into consideration while designing a DR strategy and the corresponding implementations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We should be able to roll out DR one instance at a time&lt;/li&gt;
&lt;li&gt;The rollout should be minimally invasive and should not cause any service disruption unless absolutely necessary&lt;/li&gt;
&lt;li&gt;Taking regular snapshots should be automated, but restoration to a previous point-in-time snapshot should require manual intervention&lt;/li&gt;
&lt;li&gt;The user making the DR plan should be able to specify windows in which backups are taken (the backup process should not cause any disruption in service)&lt;/li&gt;
&lt;li&gt;The user should be able to specify the windows for which snapshots are retained&lt;/li&gt;
&lt;li&gt;The strategy should strike a good balance between RTO and RPO&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With the above-mentioned considerations in mind, we can design a general-purpose strategy that can be implemented across cloud providers in different ways. For example, a DR strategy specification might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"creationStrategy"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"triggerSchedule"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"0 1 * * ? *"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"0 2 * * ? *"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"retentionStrategy"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"triggerSchedule"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"0 2 * * ? *"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"rollingWindow"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"hours"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"days"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"weeks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"incrementalWindow"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"init"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1514764800"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"span"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"month"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"interval"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
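&lt;p&gt;The &lt;code&gt;rollingWindow&lt;/code&gt; portion of the strategy above maps to retention logic along these lines. This is a minimal sketch in Python; the helper name and window defaults are illustrative, not part of our actual implementation:&lt;/p&gt;

```python
from datetime import datetime, timedelta

def retained(snapshots, now, hours=8, days=15, weeks=4):
    """Apply a moving-window retention policy to snapshot timestamps.

    Returns the set of timestamps to keep: every snapshot from the last
    `hours` hours, one per day for the last `days` days, and one per
    week for the last `weeks` weeks. Everything else is deleted.
    """
    keep = set()
    for ts in snapshots:
        age_hours = int((now - ts).total_seconds() // 3600)
        # Keep every snapshot taken within the hourly window.
        if age_hours in range(hours):
            keep.add(ts)
    per_day, per_week = {}, {}
    # Iterating in ascending order means the newest snapshot in each
    # day/week bucket overwrites the older ones.
    for ts in sorted(snapshots):
        age_days = (now - ts).days
        if age_days in range(days):
            per_day[ts.date()] = ts
        if age_days // 7 in range(weeks):
            per_week[ts.isocalendar()[:2]] = ts
    return keep | set(per_day.values()) | set(per_week.values())
```

A retention cron would run this against the list of existing snapshots and delete everything not in the returned set.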



&lt;h1&gt;
  
  
  Architecture
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Cloud agnostic vs cloud specific
&lt;/h2&gt;

&lt;p&gt;Our CD platform is written in Terraform and works off of a state file in an S3 bucket. Each execution of the CD pipeline takes the desired cluster state, compares it with the existing cluster state, and tries to move the cluster from the current to the desired state (like a control loop). One of the core ideas while building our CI/CD systems was that different teams should be able to spawn their own instances of MySQL, MongoDB, etc., without worrying about the cloud provider their application runs on. For example, requesting a MySQL instance on AWS would launch an Aurora instance, while requesting the same MySQL instance on Alicloud would launch an ApsaraDB instance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--KbkD-itr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642364202338/uN-YYIcTZ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--KbkD-itr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642364202338/uN-YYIcTZ.png" alt="c8c5a556-7630-4987-b39e-d8aad3d0d1d7.png" width="880" height="359"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--R6lTcGAC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642364210544/NRRM6cLXP.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--R6lTcGAC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642364210544/NRRM6cLXP.png" alt="169e4fcd-c65d-454d-99e6-4f9bcff9de8e.png" width="721" height="241"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Because of this use-case, even our DR implementation had to be written in a way that could be implemented differently for different cloud components across different cloud providers. The DR setup includes the following steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The CD pipeline creates the crons required to implement the above-mentioned functionality, according to the cloud-specific implementation of the infrastructure component.&lt;/li&gt;
&lt;li&gt;The cronjob or cloud function responsible for creating or removing snapshots internally makes an API call suited to the underlying implementation of the database. For example, the underlying implementation of an SQL database differs across cloud providers -- Aurora on AWS, ApsaraDB on Alicloud, Azure Database for MySQL on Azure, etc.&lt;/li&gt;
&lt;li&gt;The implemented crons trigger at the required intervals and create or delete snapshots&lt;/li&gt;
&lt;/ul&gt;
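&lt;p&gt;On AWS, for example, the snapshot-creation cron for an Aurora cluster boils down to a single RDS API call. A minimal sketch using boto3 follows; the cluster name and the snapshot-naming scheme are hypothetical:&lt;/p&gt;

```python
from datetime import datetime, timezone

def snapshot_id(cluster, now=None):
    """Build a unique, sortable snapshot identifier (hypothetical naming scheme)."""
    now = now or datetime.now(timezone.utc)
    return "{}-{}".format(cluster, now.strftime("%Y-%m-%d-%H-%M"))

def take_aurora_snapshot(cluster):
    """Trigger a manual Aurora cluster snapshot via the RDS API."""
    import boto3  # imported lazily so the naming helper stays testable offline
    rds = boto3.client("rds")
    return rds.create_db_cluster_snapshot(
        DBClusterIdentifier=cluster,
        DBClusterSnapshotIdentifier=snapshot_id(cluster),
    )
```

The equivalent call for ApsaraDB or Azure Database for MySQL would use that provider's SDK, with the rest of the cron unchanged.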

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--e7Zo-hMA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642361736883/rRdVbMnQW.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--e7Zo-hMA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642361736883/rRdVbMnQW.png" alt="Disaster Recovery Demo.png" width="880" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Snapshot Creation and Retention
&lt;/h2&gt;

&lt;p&gt;For both snapshot creation and retention, we need the ability to run an automated job at regular intervals that can take new snapshots or remove existing ones. The frequency at which this job runs is defined in our DR strategy, and its value can be decided based on our RTO and RPO objectives.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--CNy--E-I--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642352479561/FHFdnkL1O.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--CNy--E-I--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642352479561/FHFdnkL1O.png" alt="DR.png" width="709" height="309"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Snapshot Restoration
&lt;/h2&gt;

&lt;p&gt;As mentioned before, restoring a database from a snapshot should require explicit user intent, and the corresponding API and UI should be guarded by proper access control. The restoration flow can look something like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User marks a snapshot as an active candidate for restoration via the API or UI&lt;/li&gt;
&lt;li&gt;The CI system should validate the user's request and access level, and store the user intent in a database&lt;/li&gt;
&lt;li&gt;The user intent should be passed to the CD pipeline in the next run. This can work in a pull or a push model.&lt;/li&gt;
&lt;li&gt;The CD pipeline should infer that the desired state requires a database to be restored from a snapshot and take action accordingly to restore the data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--tFLS6bNa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642430524359/4UbxdCcLW.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tFLS6bNa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642430524359/4UbxdCcLW.png" alt="restore_snapshot_new.png" width="596" height="362"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the interest of not making this post too long, I will describe the exact implementations we chose for MySQL, MongoDB, and Elasticsearch in the next part of this series, along with the lessons we learned while implementing DR for each of these database types. Stay tuned and cheers :)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qckHygGq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642442792484/nwdrZVrta.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qckHygGq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://cdn.hashnode.com/res/hashnode/image/upload/v1642442792484/nwdrZVrta.gif" alt="cheers-jack-sparrow.gif" width="436" height="199"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>cloud</category>
      <category>programming</category>
      <category>database</category>
    </item>
    <item>
      <title>Elasticsearch Backup and Restore with AWS S3 in Kubernetes</title>
      <dc:creator>Ambar Mehrotra</dc:creator>
      <pubDate>Sat, 13 Jun 2020 18:36:25 +0000</pubDate>
      <link>https://forem.com/_notanengineer/elasticsearch-backup-and-restore-with-aws-s3-in-kubernetes-mof</link>
      <guid>https://forem.com/_notanengineer/elasticsearch-backup-and-restore-with-aws-s3-in-kubernetes-mof</guid>
<description>&lt;p&gt;In my day job, I get to work with things like Docker, Kubernetes, Terraform, and various cloud components across cloud providers. We have multiple Elasticsearch clusters running inside our Kubernetes cluster (EKS), installed as charts via &lt;a href="https://helm.sh/"&gt;Helm&lt;/a&gt;, the well-known package manager for Kubernetes. Recently, I had to set up a disaster-recovery strategy to restore these clusters to a previous stable state in case of a failure.&lt;/p&gt;

&lt;p&gt;The process involved taking regular snapshots of the Elasticsearch cluster and backing them up in an &lt;strong&gt;S3 bucket&lt;/strong&gt;. These backups can later be used to restore the cluster state at a given point in time in case of a disaster. Although the process was not that complicated and was more or less documented, I still had to google some configuration options to get it to work properly, so I thought I would lay out the exact steps in a small blog post.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;NOTE: If you are using Elasticsearch version 7.5 and above, Elasticsearch has a pretty great module called &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/7.x/getting-started-snapshot-lifecycle-management.html"&gt;Snapshot Lifecycle Management&lt;/a&gt; and I suggest you check that out.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The main idea behind the setup goes like the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Configure the S3 repository plugin for the Elasticsearch cluster&lt;/li&gt;
&lt;li&gt;Call the ES snapshot API at regular intervals to take incremental snapshots&lt;/li&gt;
&lt;li&gt;Use the restore API to restore the indexes or cluster state from these backups&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The setup for achieving the above-mentioned goals can be divided into 3 main parts:&lt;/p&gt;

&lt;h2&gt;
  
  
  Enable the S3 repository plugin
&lt;/h2&gt;

&lt;p&gt;Enabling plugins in Elasticsearch requires a restart of the ES cluster. Therefore, the official documentation suggests building a custom Docker image with the S3 plugin installed in the image itself. According to the docs:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;There are a couple of reasons we recommend this.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tying the availability of Elasticsearch to the download service to install plugins is not a great idea or something that we recommend. Especially in Kubernetes where it is normal and expected for a container to be moved to another host at random times.&lt;/li&gt;
&lt;li&gt;Mutating the state of a running Docker image (by installing plugins) goes against best practices of containers and immutable infrastructure.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;So, to build a Docker image with the S3 repository plugin enabled, you can use the following Dockerfile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ARG elasticsearch_version
FROM docker.elastic.co/elasticsearch/elasticsearch:${elasticsearch_version}

RUN bin/elasticsearch-plugin install --batch repository-s3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enabling plugins in ES requires extra permissions; the &lt;code&gt;--batch&lt;/code&gt; flag tells ES to grant any permissions required for the plugin installation without prompting for confirmation.&lt;/p&gt;
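&lt;p&gt;The image can then be built by passing the ES version as a build argument (the image tag below is a placeholder):&lt;/p&gt;

```shell
docker build --build-arg elasticsearch_version=6.3.1 -t my-registry/elasticsearch-s3:6.3.1 .
```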

&lt;h2&gt;
  
  
  Configure Elasticsearch to use S3 bucket for storing snapshots
&lt;/h2&gt;

&lt;p&gt;There are many parameters you can adjust when registering an S3 bucket for storing Elasticsearch snapshots; for the complete set of options, take a look at the &lt;a href="https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository-s3-repository.html"&gt;official documentation&lt;/a&gt;. For a basic setup, you can register the S3 bucket by making a curl call to the snapshot repository endpoint of ES:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;PUT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;_snapshot/my_s&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="err"&gt;_repository&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"s3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"settings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"bucket"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"my_bucket_name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"another_setting"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"setting_value"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configure permissions that allow Elasticsearch pod to access the S3 bucket
&lt;/h2&gt;

&lt;p&gt;Thanks to amazing projects like &lt;a href="https://github.com/jtblin/kube2iam"&gt;kube2iam&lt;/a&gt; that help you easily provide required IAM access to individual Kubernetes objects, this job has become quite easy. The helm chart for Elasticsearch has the provision of taking &lt;code&gt;podAnnotations&lt;/code&gt; as an input. These annotations are applied to the Elasticsearch pods and can leverage the full functionality of kube2iam for accessing the S3 bucket.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;podAnnotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;  
  &lt;span class="na"&gt;iam.amazonaws.com/role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-iam-role"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;The corresponding IAM role can be easily generated using AWS clients like boto3 or AWS plugins in Terraform, or any other AWS client at your disposal.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Informing the Elasticsearch Helm chart about ES version
&lt;/h2&gt;

&lt;p&gt;This setting was not mentioned in the plugin documentation in a straightforward manner, and I had to search around a bit to figure it out. You need to set the &lt;code&gt;esMajorVersion&lt;/code&gt; flag as well if you are using a custom image and not running the chart's default Elasticsearch version. For example, I had to set &lt;code&gt;esMajorVersion: 6&lt;/code&gt; as I was running version 6.3.1 of Elasticsearch.&lt;br&gt;
You can have a look at the Elasticsearch &lt;a href="https://github.com/elastic/helm-charts/blob/master/elasticsearch/templates/statefulset.yaml#L250"&gt;statefulset&lt;/a&gt; for checking the exact usage of this flag.&lt;/p&gt;
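&lt;p&gt;For reference, the relevant &lt;code&gt;values.yaml&lt;/code&gt; fragment for the chart might look like this (the custom image name is hypothetical):&lt;/p&gt;

```yaml
image: "my-registry/elasticsearch-s3"
imageTag: "6.3.1"
esMajorVersion: 6
```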

&lt;p&gt;That's it, now we are ready to take Elasticsearch snapshots or restore from them.&lt;/p&gt;
&lt;h2&gt;
  
  
  Taking Snapshots
&lt;/h2&gt;

&lt;p&gt;This part is pretty straightforward. Elasticsearch provides a snapshot API which can be triggered to take backups of the entire cluster state or specific indexes.&lt;/p&gt;

&lt;p&gt;For snapshots of the entire cluster, you can use the following curl call&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;PUT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/_snapshot/my_backup/snapshot_&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="err"&gt;?wait_for_completion=&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also specify exact indexes that you want to take backup of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;PUT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/_snapshot/my_backup/snapshot_&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="err"&gt;?wait_for_completion=&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"indices"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"index_1,index_2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ignore_unavailable"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"include_global_state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"metadata"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"taken_by"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"kimchy"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"taken_because"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"backup before upgrading"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once a snapshot is created, information about it can be obtained using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;GET&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/_snapshot/my_backup/snapshot_&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also, to automate the process of taking regular backups, you can use Kubernetes CronJobs to periodically make these API calls to the Elasticsearch snapshot endpoint.&lt;/p&gt;
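&lt;p&gt;A minimal CronJob that triggers a nightly snapshot could look like the following sketch. The service name assumes the Helm chart's default, and the snapshot name uses Elasticsearch's URL-encoded date-math syntax so each run gets a dated name:&lt;/p&gt;

```yaml
apiVersion: batch/v1beta1        # use batch/v1 on newer Kubernetes versions
kind: CronJob
metadata:
  name: es-snapshot
spec:
  schedule: "0 1 * * *"          # every day at 01:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: take-snapshot
            image: curlimages/curl:7.72.0
            # URL-encoded date-math name, e.g. snap-2020.06.13
            args: ["-XPUT", "http://elasticsearch-master:9200/_snapshot/my_s3_repository/%3Csnap-%7Bnow%2Fd%7D%3E"]
```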

&lt;h2&gt;
  
  
  Restoring from a snapshot
&lt;/h2&gt;

&lt;p&gt;The restore API is pretty simple as well. By default, all indices in the snapshot are restored, and the cluster state is not restored. You can make the following curl call for restoring from a snapshot:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/_snapshot/my_backup/snapshot_&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="err"&gt;/_restore&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also provide index level information while restoring from a snapshot:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;POST&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;/_snapshot/my_backup/snapshot_&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="err"&gt;/_restore&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"indices"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"index_1,index_2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ignore_unavailable"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"include_global_state"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;              
  &lt;/span&gt;&lt;span class="nl"&gt;"rename_pattern"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"index_(.+)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rename_replacement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"restored_index_$1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"include_aliases"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The restore operation can be performed on a functioning cluster. However, an existing index can only be restored if it’s &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/7.7/indices-open-close.html"&gt;closed&lt;/a&gt; and has the same number of shards as the index in the snapshot.&lt;/p&gt;
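
&lt;p&gt;For example, to restore over an existing index, you would close it first and then run the restore (again assuming a local cluster, and a hypothetical index named index_1 that exists in the snapshot):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl -X POST "http://localhost:9200/index_1/_close"
curl -X POST "http://localhost:9200/_snapshot/my_backup/snapshot_1/_restore" \
  -H 'Content-Type: application/json' \
  -d '{"indices": "index_1"}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;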

&lt;p&gt;That's All Folks!&lt;/p&gt;

&lt;p&gt;Happy Coding! Cheers :)&lt;/p&gt;

</description>
      <category>aws</category>
      <category>elasticsearch</category>
      <category>s3</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>An Introduction to Kubernetes Health Checks - Readiness Probe (Part II)</title>
      <dc:creator>Ambar Mehrotra</dc:creator>
      <pubDate>Sat, 06 Jun 2020 13:59:54 +0000</pubDate>
      <link>https://forem.com/_notanengineer/an-introduction-to-kubernetes-health-checks-readiness-probe-part-ii-4amj</link>
      <guid>https://forem.com/_notanengineer/an-introduction-to-kubernetes-health-checks-readiness-probe-part-ii-4amj</guid>
<description>&lt;p&gt;It's been a long time since I wrote, and this post on Kubernetes Readiness probes has been long overdue. If you haven't read the first part of this series on &lt;a href="https://dev.to/_notanengineer/an-introduction-to-kubernetes-health-checks-liveness-probe-part-i-2elj"&gt;Kubernetes Liveness Probes&lt;/a&gt;, I suggest you check it out. In this post, we will look mainly at the Readiness Probe and how it can be used to monitor the health of your applications.&lt;/p&gt;

&lt;p&gt;As discussed earlier, Kubernetes provides &lt;strong&gt;3 different kinds of health checks&lt;/strong&gt; to monitor the state of your applications:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Liveness Probe&lt;/li&gt;
&lt;li&gt;Readiness Probe&lt;/li&gt;
&lt;li&gt;Startup Probe&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you are working with cloud applications, you might come across scenarios where one or more instances of your application are not ready to serve any requests. In such scenarios, you would rather that traffic not be routed to those instances. Some of these scenarios include but are not limited to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One of your application instances might be performing a batch operation periodically -- like reading a large SQL table and writing the results to S3&lt;/li&gt;
&lt;li&gt;Your application instances might be loading data from a DB to a cache on startup, and you do not want them to serve any traffic until the cache is populated&lt;/li&gt;
&lt;li&gt;You might not want your application to serve any traffic if some of the dependent services are down -- for example, if you have an image processing service that works off of files in Amazon S3, you might want to stop directing any traffic to your image processing service if S3 itself is down.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: In the above scenario, it is advisable to configure your readiness probe so that it can differentiate between the dependent service being unavailable and it merely having latency issues. For example, you would not want your service to stop serving requests if S3 has an increased latency of 100ms.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;In most of the scenarios mentioned above, you don’t want to kill the application, but you don’t want to send it requests either. Kubernetes provides Readiness probes to detect and mitigate these situations&lt;/strong&gt;. Readiness probes can be used by your application to tell Kubernetes that it is not ready to accept any traffic at the moment.&lt;/p&gt;
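
&lt;p&gt;As a rough sketch of the cache-loading scenario above, a readiness endpoint can simply report the application's internal state. The handler and the CACHE_READY flag below are hypothetical, not part of any Kubernetes API; the only contract is the status code, since the kubelet treats 200-399 as ready:&lt;/p&gt;

```python
from http.server import BaseHTTPRequestHandler

CACHE_READY = False  # hypothetical flag, flipped to True once startup loading completes


def readiness_code(ready):
    # Map internal readiness state to an HTTP status for the kubelet:
    # 200 keeps the pod in the Service endpoints, 503 removes it
    return 200 if ready else 503


class ReadinessHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/readiness":
            self.send_response(readiness_code(CACHE_READY))
            self.end_headers()
```

Point the pod's &lt;code&gt;readinessProbe.httpGet&lt;/code&gt; at &lt;code&gt;/readiness&lt;/code&gt; and the pod will only receive traffic once the flag is set.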

&lt;p&gt;&lt;a href="https://i.giphy.com/media/8cuVdoyDlfRnPFYMcv/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/8cuVdoyDlfRnPFYMcv/giphy.gif" alt="Not Ready" width="480" height="270"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;According to the Kubernetes Documentation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The kubelet uses readiness probes to know when a container is ready to start accepting traffic. A Pod is considered ready when all of its containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;What this essentially means is that &lt;strong&gt;when the Readiness probe fails for a particular pod of your application, Kubernetes removes that pod from the service mapping and stops forwarding any traffic to it&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/dWCimzZf4IbSWaPIZA/source.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/dWCimzZf4IbSWaPIZA/source.gif" alt="Readiness Prob" width="1010" height="862"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Anatomy of a Readiness Probe
&lt;/h1&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;readiness&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;readiness-exec&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;readiness&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k8s.gcr.io/busybox&lt;/span&gt;
    &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/bin/sh&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;-c&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep &lt;/span&gt;&lt;span class="m"&gt;600&lt;/span&gt;
    &lt;span class="na"&gt;readinessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;exec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cat&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/tmp/healthy&lt;/span&gt;
      &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
      &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you look at the &lt;code&gt;readinessProbe&lt;/code&gt; section of the yaml, you can see that the kubelet performs a &lt;code&gt;cat&lt;/code&gt; operation on the &lt;code&gt;/tmp/healthy&lt;/code&gt; file. If the file is present and the cat operation is successful, the command returns with exit status 0, and the kubelet considers the container to be available and ready to accept traffic. On the other hand, if the command returns with a non-zero exit status, the kubelet removes the container from the Service/LoadBalancer, and no traffic is forwarded to it until the readiness probe succeeds again.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;initialDelaySeconds&lt;/code&gt; parameter tells the kubelet that it should wait for 5 seconds before performing the first readiness check. This ensures that the container is not considered to be in an unavailable state when it is booting up. After the initial delay, the kubelet performs the readiness check every 5 seconds as defined by the &lt;code&gt;periodSeconds&lt;/code&gt; field.&lt;/p&gt;
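
&lt;p&gt;Once the pod above is applied, you can watch the READY column flip as the probe starts failing after &lt;code&gt;/tmp/healthy&lt;/code&gt; is removed (the filename readiness-exec.yaml is a placeholder for wherever you saved the manifest):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply -f readiness-exec.yaml
kubectl get pod readiness-exec --watch
kubectl describe pod readiness-exec   # shows the readiness probe failure events
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;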

&lt;p&gt;Apart from generic commands, a Readiness probe can also be defined over TCP and HTTP endpoints, which are especially helpful if you are developing web applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  TCP readiness probe
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;goproxy&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;goproxy&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;goproxy&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k8s.gcr.io/goproxy:0.1&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
    &lt;span class="na"&gt;readinessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;tcpSocket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
      &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;15&lt;/span&gt;
      &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This kind of readiness probe is basically a port check. If you want to check if a particular port on your web application is responsive or not, this is the way to go.&lt;/p&gt;
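
&lt;p&gt;Conceptually, the check is no different from probing the port yourself; something like the following netcat call (with a placeholder pod IP) approximates what the kubelet does:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;POD_IP=10.0.0.12   # placeholder; use the actual pod IP
nc -z -w 2 "$POD_IP" 8080; echo $?   # exit status 0 means the port accepted a connection
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;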

&lt;h2&gt;
  
  
  HTTP readiness probe
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;readiness&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;readiness-http&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;readiness&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k8s.gcr.io/readiness&lt;/span&gt;
    &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/server&lt;/span&gt;
    &lt;span class="na"&gt;readinessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/readiness&lt;/span&gt;
        &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
        &lt;span class="na"&gt;httpHeaders&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Custom-Header&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Awesome&lt;/span&gt;
      &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
      &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For an HTTP readiness probe, the kubelet polls the endpoint of the container as defined by the &lt;code&gt;path&lt;/code&gt; and &lt;code&gt;port&lt;/code&gt; parameters in the yaml. If the endpoint returns a success status code, the container is considered healthy.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Any code greater than or equal to 200 and less than 400 indicates success. Any other code indicates failure&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;In this post, we looked at scenarios where you might not want an instance of your application to be available to serve requests, and how the Kubernetes Readiness Probe helps you identify and mitigate such scenarios effectively. Stay healthy and stay tuned.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/8lMQKIZIXiOn0VVs3A/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/8lMQKIZIXiOn0VVs3A/giphy.gif" alt="Gotta stay healthy" width="480" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Happy Coding! Cheers :)&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>docker</category>
      <category>aws</category>
      <category>devops</category>
    </item>
    <item>
      <title>An Introduction to Kubernetes Health Checks - Liveness Probe (Part I)</title>
      <dc:creator>Ambar Mehrotra</dc:creator>
      <pubDate>Sun, 19 Apr 2020 17:43:26 +0000</pubDate>
      <link>https://forem.com/_notanengineer/an-introduction-to-kubernetes-health-checks-liveness-probe-part-i-2elj</link>
      <guid>https://forem.com/_notanengineer/an-introduction-to-kubernetes-health-checks-liveness-probe-part-i-2elj</guid>
      <description>&lt;p&gt;This post was originally published on my blog: &lt;a href="https://ambar.dev/kubernetes-liveness-probe.html"&gt;https://ambar.dev/kubernetes-liveness-probe.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It was not very long ago that we were deploying individual services on each Virtual Machine. This process required the engineer in charge of a deployment to be aware of all the machines where each service ran. Sure, people had built great solutions around this deployment model, like tagging their EC2 machines with special names and using automation tools like Rundeck, Jenkins, etc., to automate the deployment process. Although this approach had matured to a great extent over several years, it still had its shortcomings -- &lt;em&gt;random application crashes, inefficient deployment practices, poor resilience to failures, improper resource utilization, and bad practices around secret and configuration management&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The rise of Docker and Kubernetes
&lt;/h2&gt;

&lt;p&gt;In order to solve the above-mentioned problems, people started building solutions around container environments like Docker and Kubernetes, which not only solved those problems but also provided other benefits. One of the major benefits of using a platform like &lt;strong&gt;Kubernetes&lt;/strong&gt; is that it provides &lt;strong&gt;self-healing&lt;/strong&gt; capabilities to your application. According to the Kubernetes documentation, self-healing can be defined as:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Kubernetes restarts containers that fail, replaces containers, kills containers that don’t respond to your user-defined health check, and doesn’t advertise them to clients until they are ready to serve.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;What this basically means is that if your application for some reason goes into a state where it cannot perform its desired function, Kubernetes will try to replace the crashing instance with a new one until it succeeds. Well, how does Kubernetes know that a pod (&lt;em&gt;a Pod is the basic execution unit of a Kubernetes application&lt;/em&gt;) is not in a healthy state, or whether it is ready to handle any extra workload at the moment? Kubernetes solves this problem with the help of &lt;strong&gt;health checks&lt;/strong&gt;. Kubernetes has 2 types of health checks that it uses to determine the health of a running pod -- the Liveness Probe and the Readiness Probe. In this first part, we will take a look at how the liveness probe works and how we can use it to keep our applications healthy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Liveness Probe
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/SYRBDJ0Pj3pSxx6Lft/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/SYRBDJ0Pj3pSxx6Lft/giphy.gif" alt="Kubernetes Liveness Probe" width="480" height="270"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Developers and engineers often make mistakes. Sometimes, these mistakes don't get caught in our nightly or staging environments and spill over to production. Often, they result in applications that get stuck in tricky situations and hence cannot perform their designated operations as usual. Sometimes, these corner cases can cause the application to crash under the most unexpected circumstances, when it is not possible for an engineer to take a look and correct it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/u5Pxn776rafRe/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/u5Pxn776rafRe/giphy.gif" alt="Unexpected Circumstances" width="433" height="221"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Some of the corner cases might include the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An application not responding because of a deadlock&lt;/li&gt;
&lt;li&gt;Null Pointer Exceptions causing the application to crash&lt;/li&gt;
&lt;li&gt;Out of Memory (OOM) errors causing the application to crash&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Often, applications stuck in these states need a restart to start functioning correctly again&lt;/strong&gt;. The &lt;a href="https://kubernetes.io/docs/admin/kubelet/"&gt;kubelet&lt;/a&gt; uses &lt;strong&gt;liveness probes&lt;/strong&gt; to check whether the application is alive and behaving correctly, and hence to know when to restart a container. Let us look at an example to see what parameters are involved in a liveness probe.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;liveness&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;liveness-exec&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;liveness&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k8s.gcr.io/busybox&lt;/span&gt;
    &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/bin/sh&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;-c&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep &lt;/span&gt;&lt;span class="m"&gt;600&lt;/span&gt;
    &lt;span class="na"&gt;livenessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;exec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cat&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/tmp/healthy&lt;/span&gt;
      &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
      &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you look at the &lt;code&gt;livenessProbe&lt;/code&gt; section of the yaml, you can see that the &lt;em&gt;kubelet&lt;/em&gt; performs a &lt;code&gt;cat&lt;/code&gt; operation on the &lt;code&gt;/tmp/healthy&lt;/code&gt; file. If the file is present and the cat operation is successful, the command returns with &lt;em&gt;exit status 0&lt;/em&gt;, and the kubelet considers the container to be in a healthy state. On the other hand, if the command returns with a &lt;em&gt;non-zero exit status&lt;/em&gt;, the kubelet kills the container and restarts it.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;initialDelaySeconds&lt;/code&gt; parameter tells the &lt;em&gt;kubelet&lt;/em&gt; that it should wait for 5 seconds before performing the first liveness check. This ensures that the container is not considered to be in a crashing state when it is booting up. After the initial delay, the &lt;em&gt;kubelet&lt;/em&gt; performs the liveness check every 5 seconds as defined by the &lt;code&gt;periodSeconds&lt;/code&gt; field.&lt;/p&gt;

&lt;p&gt;When the container starts, it executes the command &lt;code&gt;touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600&lt;/code&gt; that can be divided into the following parts which are performed in the mentioned order:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create the file &lt;code&gt;/tmp/healthy&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Go to sleep for 30s&lt;/li&gt;
&lt;li&gt;Delete the earlier created file &lt;code&gt;/tmp/healthy&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Go to sleep for 600s&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After the file &lt;code&gt;/tmp/healthy&lt;/code&gt; is deleted, the liveness probe will start failing and returning a non-zero exit code to the &lt;em&gt;kubelet&lt;/em&gt;. On detecting the failure, the &lt;em&gt;kubelet&lt;/em&gt; will kill the existing container and replace it with a new one. The &lt;em&gt;kubelet&lt;/em&gt; will keep doing this until the liveness probe succeeds. You can run the command &lt;code&gt;kubectl describe po liveness-exec&lt;/code&gt; to view the pod events.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--eivH6zB0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://imgur.com/PmEXLS0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--eivH6zB0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://imgur.com/PmEXLS0.png" alt="Liveness Probe Pod Status" width="880" height="174"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see, when the &lt;em&gt;kubelet&lt;/em&gt; found the pod to be unhealthy 3 consecutive times over a period of 14 seconds, it marked the pod as &lt;strong&gt;unhealthy&lt;/strong&gt; and went ahead to restart it. Apart from generic commands, a Liveness probe can also be defined over &lt;code&gt;TCP&lt;/code&gt; and &lt;code&gt;HTTP&lt;/code&gt; endpoints which are especially helpful if you are developing web applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  TCP liveness probe
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;goproxy&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;goproxy&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;goproxy&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k8s.gcr.io/goproxy:0.1&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
    &lt;span class="na"&gt;livenessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;tcpSocket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
      &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;15&lt;/span&gt;
      &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This kind of liveness probe is basically a port check. If you want to check if a particular port on your web application is responsive or not, this is the way to go.&lt;/p&gt;

&lt;h3&gt;
  
  
  HTTP liveness probe
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;liveness&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;liveness-http&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;liveness&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k8s.gcr.io/liveness&lt;/span&gt;
    &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/server&lt;/span&gt;
    &lt;span class="na"&gt;livenessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/healthz&lt;/span&gt;
        &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
        &lt;span class="na"&gt;httpHeaders&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Custom-Header&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Awesome&lt;/span&gt;
      &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
      &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For an HTTP liveness probe, kubelet polls the endpoint of the container as defined by the &lt;code&gt;path&lt;/code&gt; and &lt;code&gt;port&lt;/code&gt; parameters in the yaml. If the endpoint returns a success status code, the container is considered healthy.&lt;/p&gt;
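
&lt;p&gt;The kubelet's HTTP check is essentially a GET request with the configured headers; you can approximate it with curl from inside the cluster (the pod IP below is a placeholder):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;POD_IP=10.0.0.12   # placeholder; use the actual pod IP
curl -s -o /dev/null -w "%{http_code}" -H "Custom-Header: Awesome" "http://$POD_IP:8080/healthz"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;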

&lt;blockquote&gt;
&lt;p&gt;Any code greater than or equal to 200 and less than 400 indicates success. Any other code indicates failure&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this post, we saw the problems with the traditional approach to deploying and monitoring applications, the solutions that Docker and Kubernetes provide for handling those issues, and how the Liveness Probe helps resolve them. In the next post, we will take a look at the other kind of Kubernetes health check -- the Readiness Probe. Stay healthy and stay tuned.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/8lMQKIZIXiOn0VVs3A/giphy.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/8lMQKIZIXiOn0VVs3A/giphy.gif" alt="Healthy" width="480" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Happy Coding! Cheers :)&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>docker</category>
      <category>devops</category>
      <category>aws</category>
    </item>
    <item>
      <title>Installing Elasticsearch inside a Kubernetes cluster with Helm and Terraform</title>
      <dc:creator>Ambar Mehrotra</dc:creator>
      <pubDate>Fri, 03 Apr 2020 14:13:36 +0000</pubDate>
      <link>https://forem.com/_notanengineer/installing-elasticsearch-inside-a-kubernetes-cluster-with-helm-and-terraform-40jf</link>
      <guid>https://forem.com/_notanengineer/installing-elasticsearch-inside-a-kubernetes-cluster-with-helm-and-terraform-40jf</guid>
      <description>&lt;p&gt;This post was originally published on my blog: &lt;a href="https://ambar.dev/tf-helm-kubernetes-elasticsearch-setup.html"&gt;Installing Elasticsearch inside a Kubernetes cluster with Helm and Terraform&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Github Repository: &lt;a href="https://github.com/coder006/tf-helm-kubernetes-elasticsearch.git"&gt;tf-helm-kubernetes-elasticsearch&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note&lt;/em&gt;&lt;/strong&gt;:&lt;br&gt;
This guide uses Terraform for making API calls and state management. If you have helm installed on your machine, you can use that instead for installing the chart.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What is Elasticsearch?
&lt;/h2&gt;

&lt;p&gt;According to the Elasticsearch website:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Elasticsearch is a distributed, open source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Elasticsearch is generally used as the underlying engine for platforms that perform complex text search, logging, or real-time advanced analytics operations. The ELK stack (Elasticsearch, Logstash, and Kibana) has also become the de facto standard for logging and its visualization in container environments.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;Before we move forward, let us take a look at the basic architecture of Elasticsearch:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--pkQ3ztaH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/images/za-2-az.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--pkQ3ztaH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/images/za-2-az.png" alt="Elasticsearch Nodes" title="Elasticsearch Cluster" width="880" height="494"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The above is an overview of a basic &lt;strong&gt;Elasticsearch Cluster&lt;/strong&gt;. A &lt;strong&gt;node&lt;/strong&gt; is a server (physical or virtual) that stores part of the data and participates in the cluster's indexing and search operations. A &lt;strong&gt;cluster&lt;/strong&gt; is a collection of one or more such nodes that together hold all of the data. Each node in turn can hold multiple shards from one or more indices. The kinds of nodes available in Elasticsearch are &lt;em&gt;Master-eligible node&lt;/em&gt;, &lt;em&gt;Data node&lt;/em&gt;, &lt;em&gt;Ingest node&lt;/em&gt;, and &lt;em&gt;Machine learning node&lt;/em&gt; (not available in the OSS version). In this article, we will only be looking at the master and data nodes for the sake of simplicity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Master-eligible node
&lt;/h3&gt;

&lt;p&gt;A node that has the &lt;code&gt;node.master&lt;/code&gt; flag set to &lt;code&gt;true&lt;/code&gt;, which makes it eligible to be elected as the &lt;em&gt;master node&lt;/em&gt; that controls the cluster. One of the &lt;em&gt;master-eligible&lt;/em&gt; nodes is elected as the &lt;strong&gt;Master&lt;/strong&gt; via the &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery.html"&gt;master election process&lt;/a&gt;. A few of the functions performed by the &lt;em&gt;master node&lt;/em&gt; are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Creating or deleting an index&lt;/li&gt;
&lt;li&gt;Tracking which nodes are part of the cluster&lt;/li&gt;
&lt;li&gt;Deciding which shards to allocate to which nodes&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Data node
&lt;/h3&gt;

&lt;p&gt;A node that has the &lt;code&gt;node.data&lt;/code&gt; flag set to &lt;code&gt;true&lt;/code&gt;. Data nodes hold the shards that contain the documents you have indexed, and they perform operations that are I/O-, memory-, and CPU-intensive in nature. Some of the functions performed by &lt;em&gt;data nodes&lt;/em&gt; are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data related operations like CRUD&lt;/li&gt;
&lt;li&gt;Search&lt;/li&gt;
&lt;li&gt;Aggregations&lt;/li&gt;
&lt;/ul&gt;
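&lt;p&gt;Outside of Helm, these roles are set per node in &lt;code&gt;elasticsearch.yml&lt;/code&gt;; a sketch for a dedicated master-eligible node and a dedicated data node:&lt;/p&gt;

```yaml
# elasticsearch.yml for a dedicated master-eligible node
node.master: true
node.data: false
```

```yaml
# elasticsearch.yml for a dedicated data node
node.master: false
node.data: true
```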




&lt;h2&gt;
  
  
  Terminology
&lt;/h2&gt;

&lt;p&gt;Now that we have a basic idea of the Elasticsearch architecture, let us see how to install Elasticsearch inside a Kubernetes cluster using Helm and Terraform. Before moving forward, let us go through some basic terminology.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kubernetes&lt;/strong&gt;: Kubernetes is a portable, extensible, open-source platform for managing containerized workloads and services that facilitates both declarative configuration and automation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Helm&lt;/strong&gt;: Helm is an application package manager running atop Kubernetes. It allows describing the application structure through convenient helm charts and managing it with simple commands.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Terraform&lt;/strong&gt;: Terraform is a tool for building, changing, and versioning infrastructure safely and efficiently. Terraform can manage existing and popular service providers as well as custom in-house solutions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;p&gt;First, let us describe the variables and the default values needed for setting up the Elasticsearch Cluster:&lt;/p&gt;

&lt;h3&gt;
  
  
  Default Values:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="n"&gt;master_node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="n"&gt;volume_size&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;number&lt;/span&gt;
      &lt;span class="n"&gt;cpu&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;number&lt;/span&gt;
      &lt;span class="n"&gt;memory&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;number&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="n"&gt;data_node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="n"&gt;volume_size&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;number&lt;/span&gt;
      &lt;span class="n"&gt;cpu&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;number&lt;/span&gt;
      &lt;span class="n"&gt;memory&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;number&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;master_node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;volume_size&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
      &lt;span class="n"&gt;cpu&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
      &lt;span class="n"&gt;memory&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;data_node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;volume_size&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
      &lt;span class="n"&gt;cpu&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
      &lt;span class="n"&gt;memory&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="s"&gt;"kubeconfig_file_path"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;type&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
  &lt;span class="k"&gt;default&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/my/file/path"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
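&lt;p&gt;These defaults can be overridden per environment without touching the variable definitions, for example through a &lt;code&gt;terraform.tfvars&lt;/code&gt; file (the values here are purely illustrative):&lt;/p&gt;

```hcl
# terraform.tfvars -- illustrative overrides for the variables above
elasticsearch = {
  master_node = { volume_size = 50, cpu = 2, memory = 4 }
  data_node   = { volume_size = 100, cpu = 4, memory = 8 }
}

kubeconfig_file_path = "/home/me/.kube/config"
```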



&lt;blockquote&gt;
&lt;p&gt;For the sake of simplicity, I will assume that you have a working helm installation. You can also head over to the &lt;a href="https://github.com/coder006/tf-helm-kubernetes-elasticsearch.git"&gt;Github Repository&lt;/a&gt; to see how to install helm and tiller onto your Kubernetes cluster using Terraform.&lt;/p&gt;
&lt;/blockquote&gt;
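&lt;p&gt;For reference, if you do have a local helm installation, the same chart can be installed directly from the CLI. This is only a sketch, using Helm 2 syntax to match the tiller-based provider in this guide; the release and cluster names mirror the ones used here:&lt;/p&gt;

```shell
# Add the Elastic chart repository and install the chart (Helm 2 syntax)
helm repo add elastic https://helm.elastic.co
helm repo update

helm install --name elasticsearch-master elastic/elasticsearch \
  --version 7.6.1 \
  --set clusterName=elasticsearch-cluster \
  --set nodeGroup=master
```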

&lt;h3&gt;
  
  
  Terraform Helm Setup
&lt;/h3&gt;

&lt;p&gt;This step involves declaring a helm provider and the Elastic helm repository to pull the chart from:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="s"&gt;"helm"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;kubernetes&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;config_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kubeconfig_file_path&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"~&amp;gt; 0.10.4"&lt;/span&gt;
  &lt;span class="n"&gt;service_account&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kubernetes_service_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tiller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;
  &lt;span class="n"&gt;install_tiller&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="s"&gt;"helm_repository"&lt;/span&gt; &lt;span class="s"&gt;"stable"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"elastic"&lt;/span&gt;
  &lt;span class="n"&gt;url&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"https://helm.elastic.co"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Setting up Master Eligible and Data nodes
&lt;/h3&gt;

&lt;p&gt;Let us take a look at some of the important fields used in the following helm release resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;clusterName&lt;/code&gt; - The name of the elasticsearch cluster, with a default value of &lt;code&gt;elasticsearch&lt;/code&gt;. Because elasticsearch uses the cluster name to decide which cluster a new node joins, it is better to set this field explicitly to a unique value.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;nodeGroup&lt;/code&gt; - This tells the elasticsearch helm chart whether the node is a master eligible node or a data node&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;storageClassName&lt;/code&gt; - The kubernetes storage class that you want to use for provisioning the attached volumes. You can skip this field if your cloud provider has a default storageclass object defined&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;cpu&lt;/code&gt; - The number of CPU cores you want to give to the elasticsearch pod&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;memory&lt;/code&gt; - The amount of memory you want to allocate to the elasticsearch pod&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Master Eligible Nodes
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="n"&gt;helm_release&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch_master"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch-master"&lt;/span&gt;
  &lt;span class="n"&gt;repository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;helm_repository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stable&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;
  &lt;span class="n"&gt;chart&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch"&lt;/span&gt;
  &lt;span class="n"&gt;version&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"7.6.1"&lt;/span&gt;
  &lt;span class="n"&gt;timeout&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;900&lt;/span&gt;

  &lt;span class="n"&gt;values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;RAW_VALUES&lt;/span&gt;
&lt;span class="nl"&gt;volumeClaimTemplate:&lt;/span&gt;
  &lt;span class="nl"&gt;accessModes:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="s"&gt;"ReadWriteOnce"&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nl"&gt;storageClassName:&lt;/span&gt; &lt;span class="s"&gt;"my-storage-class"&lt;/span&gt;
  &lt;span class="nl"&gt;resources:&lt;/span&gt;
    &lt;span class="nl"&gt;requests:&lt;/span&gt;
      &lt;span class="nl"&gt;storage:&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elasticsearch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;master_node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;volume_size&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="n"&gt;Gi&lt;/span&gt;
&lt;span class="nl"&gt;resources:&lt;/span&gt;
  &lt;span class="nl"&gt;requests:&lt;/span&gt;
    &lt;span class="nl"&gt;cpu:&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elasticsearch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;master_node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpu&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nl"&gt;memory:&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elasticsearch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data_node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="n"&gt;Gi&lt;/span&gt;
&lt;span class="nl"&gt;roles:&lt;/span&gt;
  &lt;span class="nl"&gt;master:&lt;/span&gt; &lt;span class="s"&gt;"true"&lt;/span&gt;
  &lt;span class="nl"&gt;ingest:&lt;/span&gt; &lt;span class="s"&gt;"false"&lt;/span&gt;
  &lt;span class="nl"&gt;data:&lt;/span&gt; &lt;span class="s"&gt;"false"&lt;/span&gt;
&lt;span class="n"&gt;RAW_VALUES&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;

  &lt;span class="n"&gt;set&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"imageTag"&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"7.6.2"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;set&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"clusterName"&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch-cluster"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;set&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"nodeGroup"&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"master"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Data Nodes
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="n"&gt;helm_release&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch_data"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch-data"&lt;/span&gt;
  &lt;span class="n"&gt;repository&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;helm_repository&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stable&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;
  &lt;span class="n"&gt;chart&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch"&lt;/span&gt;
  &lt;span class="n"&gt;version&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"7.6.1"&lt;/span&gt;
  &lt;span class="n"&gt;timeout&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;900&lt;/span&gt;

  &lt;span class="n"&gt;values&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;RAW_VALUES&lt;/span&gt;
&lt;span class="nl"&gt;volumeClaimTemplate:&lt;/span&gt;
  &lt;span class="nl"&gt;accessModes:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="s"&gt;"ReadWriteOnce"&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nl"&gt;storageClassName:&lt;/span&gt; &lt;span class="s"&gt;"my-storage-class"&lt;/span&gt;
  &lt;span class="nl"&gt;resources:&lt;/span&gt;
    &lt;span class="nl"&gt;requests:&lt;/span&gt;
      &lt;span class="nl"&gt;storage:&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elasticsearch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data_node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;volume_size&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="n"&gt;Gi&lt;/span&gt;
&lt;span class="nl"&gt;resources:&lt;/span&gt;
  &lt;span class="nl"&gt;requests:&lt;/span&gt;
    &lt;span class="nl"&gt;cpu:&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elasticsearch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data_node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpu&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nl"&gt;memory:&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elasticsearch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data_node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="n"&gt;Gi&lt;/span&gt;
&lt;span class="nl"&gt;roles:&lt;/span&gt;
  &lt;span class="nl"&gt;master:&lt;/span&gt; &lt;span class="s"&gt;"false"&lt;/span&gt;
  &lt;span class="nl"&gt;ingest:&lt;/span&gt; &lt;span class="s"&gt;"true"&lt;/span&gt;
  &lt;span class="nl"&gt;data:&lt;/span&gt; &lt;span class="s"&gt;"true"&lt;/span&gt;
&lt;span class="n"&gt;RAW_VALUES&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;

  &lt;span class="n"&gt;set&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"imageTag"&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"7.6.2"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;set&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"clusterName"&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"elasticsearch-cluster"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;set&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"nodeGroup"&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"data"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Happy Coding! Cheers :)&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>terraform</category>
      <category>elasticsearch</category>
      <category>devops</category>
    </item>
    <item>
      <title>Setting up a VPN connection between AWS and Alicloud using Terraform</title>
      <dc:creator>Ambar Mehrotra</dc:creator>
      <pubDate>Sat, 28 Mar 2020 18:15:06 +0000</pubDate>
      <link>https://forem.com/_notanengineer/setting-up-a-vpn-connection-between-aws-and-alicloud-using-terraform-1pgf</link>
      <guid>https://forem.com/_notanengineer/setting-up-a-vpn-connection-between-aws-and-alicloud-using-terraform-1pgf</guid>
      <description>&lt;p&gt;Github Repository: &lt;a href="https://github.com/coder006/tf-aws-alicloud-vpn.git"&gt;tf-aws-alicloud-vpn&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note&lt;/em&gt;&lt;/strong&gt;:&lt;br&gt;
This is not a guide on the internals of a Virtual Private Network. Rather, this post outlines how to set up a VPN connection between AWS and Alicloud. This guide uses Terraform for making API calls and state management; you can choose to use any HTTP client, or the aws and alicloud CLIs, to make the same API calls and end up with a working VPN connection.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Problem Statement
&lt;/h2&gt;

&lt;p&gt;When you are working in a multicloud environment, many scenarios involve establishing a communication channel between services and resources that span cloud providers. For example, you might have a common &lt;strong&gt;Rundeck&lt;/strong&gt; machine that deploys build binaries onto virtual machines residing in AWS as well as Azure. Another example might be a script in your CI/CD platform that periodically interacts with resources across cloud providers, like &lt;strong&gt;RDS&lt;/strong&gt;, &lt;strong&gt;Mongo&lt;/strong&gt;, and &lt;strong&gt;RabbitMQ&lt;/strong&gt;, to monitor or update ACL policies.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--K3bfEON8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://www.simform.com/wp-content/uploads/2017/11/Blog-Diagram1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--K3bfEON8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://www.simform.com/wp-content/uploads/2017/11/Blog-Diagram1.png" alt="Multi Cloud Architecture" title="Multi Cloud Architecture" width="880" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Creating a VPN connection helps you securely access resources on one cloud provider from another over an encrypted connection. A VPN connection avoids the hassle of exposing a public endpoint for each resource and then securing it. You can simply whitelist a CIDR block across the VPCs, and all traffic in that CIDR range will be routed over the secure, encrypted connection.&lt;/p&gt;

&lt;h2&gt;
  
  
  VPN Setup
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--LgG9btNz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i.imgur.com/x3i3MC7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--LgG9btNz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://i.imgur.com/x3i3MC7.png" alt="Aws Alicloud VPN Architecture" width="600" height="331"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Setting up a VPN connection mainly involves setting up the following components in both AWS and Alicloud:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;VPN Gateway&lt;/li&gt;
&lt;li&gt;Customer Gateway&lt;/li&gt;
&lt;li&gt;VPN Connection&lt;/li&gt;
&lt;li&gt;Connection Route&lt;/li&gt;
&lt;/ul&gt;
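&lt;p&gt;On the AWS side, these four components correspond to standard Terraform resources. The following is only a rough sketch of that mapping; the &lt;code&gt;ip_address&lt;/code&gt; value is a placeholder for the public IP of the Alicloud VPN gateway:&lt;/p&gt;

```hcl
# Rough sketch: AWS-side VPN components. "1.2.3.4" stands in for
# the public IP of the Alicloud VPN gateway.
resource "aws_vpn_gateway" "to_alicloud" {
  vpc_id = var.aws_vpc.vpc_id
}

resource "aws_customer_gateway" "alicloud" {
  bgp_asn    = 65000        # any private ASN works for a static-route VPN
  ip_address = "1.2.3.4"
  type       = "ipsec.1"
}

resource "aws_vpn_connection" "to_alicloud" {
  vpn_gateway_id      = aws_vpn_gateway.to_alicloud.id
  customer_gateway_id = aws_customer_gateway.alicloud.id
  type                = "ipsec.1"
  static_routes_only  = true
}

resource "aws_vpn_connection_route" "alicloud_cidr" {
  destination_cidr_block = var.alicloud_vpc.cidr
  vpn_connection_id      = aws_vpn_connection.to_alicloud.id
}
```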

&lt;p&gt;First and foremost, here are the cluster-specific variables that we will need for AWS and Alicloud:&lt;/p&gt;

&lt;h3&gt;
  
  
  Variables and Cluster Definition
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="cp"&gt;# Default region: Singapore
&lt;/span&gt;&lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="s"&gt;"aws_vpc"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="n"&gt;region&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;profile&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;vpc_id&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;cidr&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;subnet_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;region&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ap-southeast-1"&lt;/span&gt;
    &lt;span class="n"&gt;profile&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"aws-profile"&lt;/span&gt;
    &lt;span class="n"&gt;vpc_id&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"123456789"&lt;/span&gt;
    &lt;span class="n"&gt;cidr&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"172.10.0.0/16"&lt;/span&gt;
    &lt;span class="n"&gt;subnet_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"subnet-123"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="cp"&gt;# Default region: Singapore
# vswitch: AWS subnet equivalent in Alicloud
&lt;/span&gt;&lt;span class="n"&gt;variable&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpc"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="n"&gt;region&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;profile&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;vpc_id&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;cidr&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;vswitch_id&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;region&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ap-southeast-1"&lt;/span&gt;
    &lt;span class="n"&gt;profile&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"alicloud-profile"&lt;/span&gt;
    &lt;span class="n"&gt;vpc_id&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"987654321"&lt;/span&gt;
    &lt;span class="n"&gt;cidr&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"172.20.0.0/16"&lt;/span&gt;
    &lt;span class="n"&gt;vswitch_id&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"vswitch-123"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Terraform Providers for AWS and Alicloud
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="s"&gt;"aws"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;region&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;
  &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"~&amp;gt; 2.45.0"&lt;/span&gt;
  &lt;span class="n"&gt;profile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;profile&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="s"&gt;"alicloud"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;region&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;
  &lt;span class="n"&gt;version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"1.71.1"&lt;/span&gt;
  &lt;span class="n"&gt;profile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;profile&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first step is creating VPN Gateways in both Alicloud and AWS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_gateway"&lt;/span&gt; &lt;span class="s"&gt;"aws_vpn_gateway"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;                 &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"AWS-VPN-Gateway"&lt;/span&gt;
  &lt;span class="n"&gt;vpc_id&lt;/span&gt;               &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vpc_id&lt;/span&gt;
  &lt;span class="n"&gt;bandwidth&lt;/span&gt;            &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"10"&lt;/span&gt;
  &lt;span class="n"&gt;enable_ssl&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;
  &lt;span class="n"&gt;instance_charge_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"PostPaid"&lt;/span&gt;
  &lt;span class="n"&gt;description&lt;/span&gt;          &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"AWS-VPN-Gateway"&lt;/span&gt;
  &lt;span class="n"&gt;vswitch_id&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vswitch_id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"aws_vpn_gateway"&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_gateway"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;vpc_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vpc_id&lt;/span&gt;

  &lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Alicloud-VPN-GW"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
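&lt;p&gt;To see the gateway's public IP address after &lt;code&gt;terraform apply&lt;/code&gt;, an output along these lines can be added (a sketch; the Alicloud gateway exposes its address via the &lt;code&gt;internet_ip&lt;/code&gt; attribute, while on the AWS side the public addresses belong to the tunnels of the VPN Connection created later):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;output "alicloud_vpn_gateway_ip" {
  description = "Public IP of the Alicloud VPN Gateway"
  value       = alicloud_vpn_gateway.aws_vpn_gateway.internet_ip
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;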



&lt;h2&gt;
  
  
  VPN Setup in AWS
&lt;/h2&gt;

&lt;p&gt;Creating a VPN Gateway gives us a publicly accessible IP address for that gateway. In the first step, we will use the IP address of the Alicloud VPN Gateway to set up the AWS side of things. Later on, we will repeat the same process for Alicloud as well.&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS Customer Gateway
&lt;/h3&gt;

&lt;p&gt;According to AWS:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A customer gateway is a resource in AWS that provides information to AWS about your Customer Gateway Device&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A Customer Gateway essentially tells AWS the remote endpoint to which traffic should be forwarded when the destination IP belongs to the Alicloud CIDR range.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"aws_customer_gateway"&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_gw"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;bgp_asn&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;65000&lt;/span&gt;
  &lt;span class="n"&gt;ip_address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;internet_ip&lt;/span&gt;
  &lt;span class="n"&gt;type&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ipsec.1"&lt;/span&gt;

  &lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"alicloud-customer-gateway"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  VPN Connection
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://i.giphy.com/media/njYrp176NQsHS/giphy-downsized-large.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://i.giphy.com/media/njYrp176NQsHS/giphy-downsized-large.gif" alt="You Shall Not Pass" width="480" height="200"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A VPN Connection resource in AWS creates 2 &lt;em&gt;Tunnels&lt;/em&gt; between your VPC and the remote network (the Alicloud network, represented by &lt;code&gt;customer_gateway_id&lt;/code&gt; in this case). The two tunnels exist for redundancy: if one tunnel goes down, traffic is automatically routed through the other.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"aws_vpn_connection"&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_connection"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;vpn_gateway_id&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;customer_gateway_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_customer_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_gw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;type&lt;/span&gt;                &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ipsec.1"&lt;/span&gt;
  &lt;span class="n"&gt;static_routes_only&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
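&lt;p&gt;The connection resource exports the attributes the Alicloud side will need later, namely the public address and preshared key of each tunnel. A sketch of outputs to surface them (the preshared keys are marked sensitive so they are not printed in plain text):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;output "tunnel_addresses" {
  value = [
    aws_vpn_connection.alicloud_vpn_connection.tunnel1_address,
    aws_vpn_connection.alicloud_vpn_connection.tunnel2_address,
  ]
}

output "tunnel_preshared_keys" {
  value = [
    aws_vpn_connection.alicloud_vpn_connection.tunnel1_preshared_key,
    aws_vpn_connection.alicloud_vpn_connection.tunnel2_preshared_key,
  ]
  sensitive = true
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;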



&lt;h3&gt;
  
  
  VPN Connection Route Entry
&lt;/h3&gt;

&lt;p&gt;This entry tells the VPN connection created in the previous step about the CIDR range of the destination network.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"aws_vpn_connection_route"&lt;/span&gt; &lt;span class="s"&gt;"alicloud"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;destination_cidr_block&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;
  &lt;span class="n"&gt;vpn_connection_id&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  AWS Route Table Modification
&lt;/h3&gt;

&lt;p&gt;Next, we fetch the route table of the private subnet and add a route that tells AWS to forward all traffic belonging to the destination CIDR range to the VPN Gateway we created above.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="s"&gt;"aws_route_table"&lt;/span&gt; &lt;span class="s"&gt;"aws_private_subnet_rt"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;subnet_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subnet_id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"aws_route"&lt;/span&gt; &lt;span class="s"&gt;"r"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;route_table_id&lt;/span&gt;            &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_route_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_private_subnet_rt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;destination_cidr_block&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;
  &lt;span class="n"&gt;gateway_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the AWS setup is done, we repeat the same steps for Alicloud. I will not explain the terminology again, as the concepts are more or less the same.&lt;/p&gt;

&lt;h2&gt;
  
  
  VPN Setup in Alicloud
&lt;/h2&gt;

&lt;p&gt;First of all, we will create 2 customer gateways in Alicloud - one for each of the &lt;em&gt;Tunnels&lt;/em&gt; created by the &lt;em&gt;VPN Connection&lt;/em&gt; in AWS. The &lt;code&gt;ip_address&lt;/code&gt; parameter of each customer gateway holds the public IP address of the corresponding tunnel.&lt;/p&gt;

&lt;h3&gt;
  
  
  Customer Gateway
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_customer_gateway"&lt;/span&gt; &lt;span class="s"&gt;"aws_customer_gateway_1"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"AWSCustomerGateway1"&lt;/span&gt;
  &lt;span class="n"&gt;ip_address&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tunnel1_address&lt;/span&gt;
  &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"AWSCustomerGateway1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_customer_gateway"&lt;/span&gt; &lt;span class="s"&gt;"aws_customer_gateway_2"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"AWSCustomerGateway2"&lt;/span&gt;
  &lt;span class="n"&gt;ip_address&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tunnel2_address&lt;/span&gt;
  &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"AWSCustomerGateway2"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  VPN Connection
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="cp"&gt;# The `effect_immediately` parameter determines whether to delete a successfully negotiated IPsec tunnel and initiate a negotiation again
&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_connection"&lt;/span&gt; &lt;span class="s"&gt;"ipsec_connection_1"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;                &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"IPSecConnection1"&lt;/span&gt;
  &lt;span class="n"&gt;vpn_gateway_id&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;customer_gateway_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_customer_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_customer_gateway_1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;local_subnet&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="n"&gt;remote_subnet&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="n"&gt;effect_immediately&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;
  &lt;span class="n"&gt;ike_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ike_auth_alg&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"sha1"&lt;/span&gt;
    &lt;span class="n"&gt;ike_enc_alg&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"aes"&lt;/span&gt;
    &lt;span class="n"&gt;ike_version&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ikev1"&lt;/span&gt;
    &lt;span class="n"&gt;ike_mode&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"main"&lt;/span&gt;
    &lt;span class="n"&gt;ike_lifetime&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;
    &lt;span class="n"&gt;psk&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tunnel1_preshared_key&lt;/span&gt;
    &lt;span class="n"&gt;ike_pfs&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"group2"&lt;/span&gt;
    &lt;span class="n"&gt;ike_local_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;internet_ip&lt;/span&gt;
    &lt;span class="n"&gt;ike_remote_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tunnel1_address&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;ipsec_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_pfs&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"group2"&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_enc_alg&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"aes"&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_auth_alg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"sha1"&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_lifetime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_connection"&lt;/span&gt; &lt;span class="s"&gt;"ipsec_connection_2"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;                &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"IPSecConnection2"&lt;/span&gt;
  &lt;span class="n"&gt;vpn_gateway_id&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;customer_gateway_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_customer_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_customer_gateway_2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;local_subnet&lt;/span&gt;        &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="n"&gt;remote_subnet&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="n"&gt;effect_immediately&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;
  &lt;span class="n"&gt;ike_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ike_auth_alg&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"sha1"&lt;/span&gt;
    &lt;span class="n"&gt;ike_enc_alg&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"aes"&lt;/span&gt;
    &lt;span class="n"&gt;ike_version&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ikev1"&lt;/span&gt;
    &lt;span class="n"&gt;ike_mode&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"main"&lt;/span&gt;
    &lt;span class="n"&gt;ike_lifetime&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;
    &lt;span class="n"&gt;psk&lt;/span&gt;           &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tunnel2_preshared_key&lt;/span&gt;
    &lt;span class="n"&gt;ike_pfs&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"group2"&lt;/span&gt;
    &lt;span class="n"&gt;ike_local_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;internet_ip&lt;/span&gt;
    &lt;span class="n"&gt;ike_remote_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tunnel2_address&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;ipsec_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_pfs&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"group2"&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_enc_alg&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"aes"&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_auth_alg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"sha1"&lt;/span&gt;
    &lt;span class="n"&gt;ipsec_lifetime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Although only a few of the above parameters are mandatory, I have included the exhaustive list to give you an idea of what parameters are available.&lt;/p&gt;

&lt;h3&gt;
  
  
  VPN Connection Route Entry
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_route_entry"&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_route_entry_1"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;vpn_gateway_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;route_dest&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;
  &lt;span class="n"&gt;next_hop&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ipsec_connection_1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;weight&lt;/span&gt;         &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="n"&gt;publish_vpc&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_route_entry"&lt;/span&gt; &lt;span class="s"&gt;"alicloud_vpn_route_entry_2"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;vpn_gateway_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpn_gateway&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;route_dest&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cidr&lt;/span&gt;
  &lt;span class="n"&gt;next_hop&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;alicloud_vpn_connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ipsec_connection_2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="n"&gt;weight&lt;/span&gt;         &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
  &lt;span class="n"&gt;publish_vpc&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The entire code can be found in the &lt;a href="https://github.com/coder006/tf-aws-alicloud-vpn.git"&gt;tf-aws-alicloud-vpn&lt;/a&gt; repository on GitHub.&lt;/p&gt;

&lt;p&gt;Happy Coding! Cheers :)&lt;/p&gt;

</description>
      <category>aws</category>
      <category>devops</category>
      <category>terraform</category>
      <category>vpn</category>
    </item>
  </channel>
</rss>
