<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Akash Warkhade</title>
    <description>The latest articles on Forem by Akash Warkhade (@akashw).</description>
    <link>https://forem.com/akashw</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F877025%2F4468ad1a-f65c-4aba-81c6-c31db65e8517.png</url>
      <title>Forem: Akash Warkhade</title>
      <link>https://forem.com/akashw</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/akashw"/>
    <language>en</language>
    <item>
      <title>Achieve Multi-tenancy in Monitoring with Prometheus &amp; Thanos Receiver</title>
      <dc:creator>Akash Warkhade</dc:creator>
      <pubDate>Sat, 24 Jun 2023 14:39:59 +0000</pubDate>
      <link>https://forem.com/akashw/achieve-multi-tenancy-in-monitoring-with-prometheus-thanos-receiver-10gb</link>
      <guid>https://forem.com/akashw/achieve-multi-tenancy-in-monitoring-with-prometheus-thanos-receiver-10gb</guid>
      <description>&lt;p&gt;Hey there! If you are reading this blog post, then I guess you are&lt;br&gt;
already aware of &lt;a href="https://prometheus.io/"&gt;Prometheus&lt;/a&gt; and how it helps&lt;br&gt;
us in &lt;a href="https://dev.to/observability-consulting/"&gt;monitoring distributed systems like Kubernetes&lt;/a&gt;. And if you are familiar with&lt;br&gt;
Prometheus, then chances are that you have come across the tool called&lt;br&gt;
Thanos. &lt;a href="https://thanos.io/"&gt;Thanos&lt;/a&gt; is a popular open-source project that helps&lt;br&gt;
enterprises &lt;a href="https://dev.to/prometheus-monitoring-support/"&gt;achieve an HA Prometheus setup&lt;/a&gt; with long-term storage&lt;br&gt;
capabilities. One of the common challenges of distributed monitoring is&lt;br&gt;
to implement multi-tenancy. &lt;a href="https://thanos.io/tip/components/receive.md/"&gt;Thanos&lt;br&gt;
receiver&lt;/a&gt; is a Thanos&lt;br&gt;
component designed to address this common challenge.&lt;br&gt;
&lt;a href="https://thanos.io/tip/components/receive.md/"&gt;Receiver&lt;/a&gt; was part of Thanos for a significant duration as an experimental feature. However, after some time, it reached the general availability stage and is now fully supported.&lt;/p&gt;
&lt;h2&gt;
  
  
  A few words on Thanos Receiver
&lt;/h2&gt;

&lt;p&gt;Receiver is a Thanos component that can accept &lt;a href="https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write"&gt;remote&lt;br&gt;
write&lt;/a&gt;&lt;br&gt;
requests from any Prometheus instance and store the data in its local&lt;br&gt;
TSDB; optionally, it can upload those TSDB blocks to an &lt;a href="https://thanos.io/tip/thanos/storage.md/"&gt;object&lt;br&gt;
storage&lt;/a&gt; like S3 or GCS at&lt;br&gt;
regular intervals. Receiver does this by implementing the &lt;a href="https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write"&gt;Prometheus&lt;br&gt;
Remote Write&lt;br&gt;
API&lt;/a&gt;.&lt;br&gt;
It builds on top of the existing Prometheus TSDB and retains its&lt;br&gt;
usefulness while extending its functionality with long-term storage,&lt;br&gt;
horizontal scalability, and down-sampling. It exposes the&lt;br&gt;
&lt;a href="https://github.com/thanos-io/thanos/blob/7ec45ef31170fb37083f75fe45bb45fcabb76e28/pkg/store/storepb/rpc.proto#L27"&gt;StoreAPI&lt;/a&gt;&lt;br&gt;
so that &lt;a href="https://thanos.io/tip/components/query.md/"&gt;Thanos Queriers&lt;/a&gt;&lt;br&gt;
can query received metrics in real-time.&lt;/p&gt;
&lt;h3&gt;
  
  
  Multi-tenancy in Thanos Receiver
&lt;/h3&gt;

&lt;p&gt;Thanos receiver supports multi-tenancy. It accepts Prometheus remote&lt;br&gt;
write requests, and writes these into a local instance of Prometheus&lt;br&gt;
TSDB. The value of the HTTP header (“THANOS-TENANT”) of the incoming&lt;br&gt;
request determines the ID of the tenant that the data belongs to. To prevent data&lt;br&gt;
leaking at the database level, each tenant has an individual TSDB&lt;br&gt;
instance, meaning a single Thanos receiver may manage multiple TSDB&lt;br&gt;
instances. Once the data is successfully committed to the tenant’s TSDB,&lt;br&gt;
the request returns successfully. Thanos Receiver also supports&lt;br&gt;
multi-tenancy by exposing labels which are similar to Prometheus&lt;br&gt;
&lt;a href="https://prometheus.io/docs/prometheus/latest/configuration/configuration/#configuration-file"&gt;external&lt;br&gt;
labels&lt;/a&gt;.&lt;/p&gt;
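&lt;p&gt;As an illustration, a hard tenant's Prometheus can attach this header through the &lt;code&gt;headers&lt;/code&gt; field of its remote write configuration. The URL and tenant ID below are placeholders, not values from any particular setup:&lt;/p&gt;

```yaml
# prometheus.yml (fragment) -- illustrative values
remote_write:
  - url: http://thanos-receive.thanos.svc.cluster.local:19291/api/v1/receive
    headers:
      # Determines the tenant ID attached to this data on the receiver side
      THANOS-TENANT: tenant-a
```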
&lt;h3&gt;
  
  
  Hashring configuration file
&lt;/h3&gt;

&lt;p&gt;If we want features like load-balancing and data replication, we can run&lt;br&gt;
multiple instances of Thanos receiver as a part of a single hashring.&lt;br&gt;
The receiver instances within the same hashring become aware of their&lt;br&gt;
peers through a hashring configuration file. Following is an example of&lt;br&gt;
a hashring configuration file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"hashring"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tenant-a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"endpoints"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"tenant-a-1.metrics.local:19291/api/v1/receive"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tenant-a-2.metrics.local:19291/api/v1/receive"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"tenants"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"tenant-a"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"hashring"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tenants-b-c"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"endpoints"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"tenant-b-c-1.metrics.local:19291/api/v1/receive"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tenant-b-c-2.metrics.local:19291/api/v1/receive"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"tenants"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"tenant-b"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tenant-c"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"hashring"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"soft-tenants"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"endpoints"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"http://soft-tenants-1.metrics.local:19291/api/v1/receive"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Soft tenancy&lt;/strong&gt; – If a hashring specifies no explicit tenants,
then any tenant is considered a valid match; this allows a
cluster to provide soft tenancy. Requests whose tenant ID explicitly
matches no other hashring automatically land in this soft-tenancy
hashring. All incoming remote write requests that don’t
set the tenant header in the HTTP request fall under soft tenancy,
and the default tenant ID (configurable through the flag
--receive.default-tenant-id) is attached to their metrics.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hard tenancy&lt;/strong&gt; – Hard tenants must set the tenant header in every
remote write HTTP request. Hard tenants in the Thanos receiver
are configured in the hashring config file, and changes to this
configuration must be orchestrated by a configuration management
tool. When a Thanos receiver receives a remote write request,
it goes through the list of configured hard tenants. Each hard tenant
is also associated with the set of receiver endpoints belonging to
it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;P.S.: A remote write request can initially be received by any receiver&lt;br&gt;
instance; however, it will only be dispatched to the receiver endpoints that&lt;br&gt;
correspond to that hard tenant.&lt;/strong&gt;&lt;/p&gt;
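&lt;p&gt;For orientation, the tenancy behavior described above is wired up through a handful of &lt;code&gt;thanos receive&lt;/code&gt; flags. The sketch below shows the relevant flags with illustrative values; it is not the exact invocation used in the manifests later in this post:&lt;/p&gt;

```
# Illustrative `thanos receive` flags (all paths/values are placeholders):
#   --receive.hashrings-file     points at a hashring config file like the example above
#   --receive.default-tenant-id  tenant attached to header-less (soft-tenant) requests
thanos receive \
  --tsdb.path=/var/thanos/receive \
  --objstore.config-file=/etc/thanos/thanos-s3.yaml \
  --receive.hashrings-file=/etc/thanos/hashring.json \
  --receive.replication-factor=2 \
  --receive.default-tenant-id=cluster \
  --remote-write.address=0.0.0.0:19291
```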
&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;In this blog post, we will implement the following architecture&lt;br&gt;
using Thanos v0.31.0.&lt;/p&gt;

&lt;p&gt;&lt;a href="/assets/img/Blog/multi-tenancy-monitoring-thanos-receiver/multi-tenancy-model.png" class="article-body-image-wrapper"&gt;&lt;img src="/assets/img/Blog/multi-tenancy-monitoring-thanos-receiver/multi-tenancy-model.png" alt="A simple multi-tenanct monitoring model with prometheus and thanos receive"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;A simple multi-tenancy model with Thanos Receiver&lt;/center&gt;

&lt;p&gt;Brief overview on the above architecture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;We have 3 Prometheus instances running in the namespaces &lt;code&gt;sre&lt;/code&gt;, &lt;code&gt;tenant-a&lt;/code&gt; and&lt;br&gt;
&lt;code&gt;tenant-b&lt;/code&gt; respectively.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Prometheus in the &lt;code&gt;sre&lt;/code&gt; namespace acts as a soft tenant,&lt;br&gt;
therefore it does not set any additional HTTP headers on its remote&lt;br&gt;
write requests.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Prometheuses in &lt;code&gt;tenant-a&lt;/code&gt; and &lt;code&gt;tenant-b&lt;/code&gt; act as&lt;br&gt;
hard tenants. The NGINX servers in those respective namespaces are&lt;br&gt;
used to set the tenant header for the tenant Prometheus.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;From a security point of view, we are only exposing the Thanos receiver&lt;br&gt;
statefulset responsible for the soft tenant (the sre Prometheus).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;For both Thanos receiver statefulsets (soft and hard) we are setting&lt;br&gt;
a &lt;a href="https://github.com/infracloudio/thanos-receiver-demo/blob/main/manifests/thanos-receive-hashring-0.yaml#L35"&gt;replication&lt;br&gt;
factor=2&lt;/a&gt;.&lt;br&gt;
This ensures that the incoming data gets replicated between two&lt;br&gt;
receiver pods.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The remote write request which is received by the &lt;a href="https://github.com/infracloudio/thanos-receiver-demo/blob/main/manifests/thanos-receive-default.yaml"&gt;soft tenant&lt;br&gt;
receiver&lt;/a&gt;&lt;br&gt;
instance is forwarded to the &lt;a href="https://github.com/infracloudio/thanos-receiver-demo/blob/main/manifests/thanos-receive-hashring-0.yaml"&gt;hard tenant Thanos&lt;br&gt;
receiver&lt;/a&gt;.&lt;br&gt;
This routing is based on the hashring config.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The above architecture obviously misses a few features that one would&lt;br&gt;
expect from a multi-tenant architecture, e.g. tenant isolation,&lt;br&gt;
authentication, etc. This blog post focuses only on how we can use the&lt;br&gt;
Thanos Receiver to store time-series from multiple Prometheus(es) to&lt;br&gt;
achieve multi-tenancy. The idea behind this setup is also to show how we&lt;br&gt;
can make the Prometheus on the tenant side nearly stateless yet maintain&lt;br&gt;
data resiliency.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;We will improve this architecture in the upcoming posts. So, stay&lt;br&gt;
tuned.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://kind.sigs.k8s.io/docs/user/quick-start/"&gt;KIND&lt;/a&gt; / managed
cluster / minikube (We will be using Kind)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;kubectl&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;helm&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;jq&lt;/code&gt; (optional)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Cluster setup
&lt;/h2&gt;

&lt;p&gt;Clone the repo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt; git clone https://github.com/infracloudio/thanos-receiver-demo.git 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Setup a local &lt;a href="https://kind.sigs.k8s.io/docs/user/quick-start/"&gt;KIND&lt;/a&gt; cluster
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Move to the &lt;code&gt;local-cluster&lt;/code&gt; directory:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd local-cluster/
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Create the cluster with calico, ingress and extra-port mappings:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./create-cluster.sh cluster-1 kind-calico-cluster-1.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Deploy the nginx ingress controller:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm &lt;span class="nb"&gt;install &lt;/span&gt;nginx-controller ingress-nginx/ingress-nginx &lt;span class="nt"&gt;-n&lt;/span&gt; ingress-nginx &lt;span class="nt"&gt;--create-namespace&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Now, move back to the root directory of the repo:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; -
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Install minio as object storage
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt; &lt;code&gt;helm repo add bitnami https://charts.bitnami.com/bitnami&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Install minio in the cluster:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;helm upgrade &lt;span class="nt"&gt;--install&lt;/span&gt; my-minio bitnami/minio &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; ingress.enabled&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="nt"&gt;--set&lt;/span&gt; auth.rootUser&lt;span class="o"&gt;=&lt;/span&gt;minio &lt;span class="nt"&gt;--set&lt;/span&gt; auth.rootPassword&lt;span class="o"&gt;=&lt;/span&gt;minio123 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--namespace&lt;/span&gt; minio &lt;span class="nt"&gt;--create-namespace&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ol&gt;

&lt;ol start="3"&gt;
&lt;li&gt;
&lt;p&gt;Port-forward the minio service:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl port-forward svc/my-minio 9001:9001 -n minio
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If you face the &lt;strong&gt;E0528 13:02:43.145873 44832 portforward.go:346] error creating error stream for port 9001 -&amp;gt; 9001: Timeout occurred&lt;/strong&gt; issue while port-forwarding minio, keep the above port-forward command running in one terminal, then open another terminal and execute the command below, which keeps the connection to the minio pods alive:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;while true ; do wget 127.0.0.1:9001 ; sleep 10 ; done
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Log in to minio by opening http://localhost:9001/ in a browser with the username &lt;code&gt;minio&lt;/code&gt; and the password &lt;code&gt;minio123&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Create a bucket named &lt;strong&gt;thanos&lt;/strong&gt; from the UI.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Install Thanos components
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Create shared components&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl create ns thanos

## Create a file thanos-s3.yaml containing the minio object storage config:
cat &amp;lt;&amp;lt; EOF &amp;gt; thanos-s3.yaml
type: S3
config:
  bucket: "thanos"
  endpoint: "my-minio.minio.svc.cluster.local:9000"
  access_key: "minio"
  secret_key: "minio123"
  insecure: true
EOF

## Create a secret from the file above, to be used with the thanos components, e.g. store, receiver
kubectl -n thanos create secret generic thanos-objectstorage --from-file=thanos-s3.yaml
kubectl -n thanos label secrets thanos-objectstorage part-of=thanos

## go to the manifests directory
cd ../manifests/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Install Thanos Receive Controller&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Deploy a thanos-receiver-controller to auto-update the hashring&lt;br&gt;
configmap when the thanos receiver statefulset scales:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; thanos-receiver-hashring-configmap-base.yaml
kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; thanos-receive-controller.yaml
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;The deployment above generates a new configmap&lt;br&gt;
&lt;code&gt;thanos-receive-generated&lt;/code&gt; and keeps it updated with a list of&lt;br&gt;
endpoints whenever a statefulset with the label&lt;br&gt;
&lt;code&gt;controller.receive.thanos.io/hashring=hashring-0&lt;/code&gt; and/or&lt;br&gt;
&lt;code&gt;controller.receive.thanos.io/hashring=default&lt;/code&gt; gets created or&lt;br&gt;
updated. The thanos receiver pods load the&lt;br&gt;
&lt;code&gt;thanos-receive-generated&lt;/code&gt; configmap into themselves.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;NOTE&lt;/strong&gt;: The &lt;strong&gt;default&lt;/strong&gt; and &lt;strong&gt;hashring-0&lt;/strong&gt; hashrings are&lt;br&gt;
responsible for the soft-tenancy and hard-tenancy respectively.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
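&lt;p&gt;The generated configmap holds a hashring file in the same format as the example shown earlier, with the endpoints filled in by the controller. A hypothetical generated entry might look like the following; the exact endpoint names and port/path depend on the statefulset, service names and Thanos version:&lt;/p&gt;

```json
[
   {
       "hashring": "hashring-0",
       "endpoints": ["thanos-receive-hashring-0-0.thanos-receive-hashring-0.thanos.svc.cluster.local:19291/api/v1/receive",
                     "thanos-receive-hashring-0-1.thanos-receive-hashring-0.thanos.svc.cluster.local:19291/api/v1/receive"],
       "tenants": ["tenant-a", "tenant-b"]
   }
]
```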

&lt;p&gt;&lt;strong&gt;Install Thanos Receiver&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Create the thanos-receiver statefulsets and headless services for&lt;br&gt;
soft and hard tenants.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;For this demo, we are not using persistent volumes.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; thanos-receive-default.yaml 
kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; thanos-receive-hashring-0.yaml
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;&lt;em&gt;The receiver pods are configured to store 15 days of data, with a replication factor of 2.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Create a service in front of the thanos receiver statefulset for the&lt;br&gt;
soft tenants.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; thanos-receive-service.yaml
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;&lt;em&gt;The pods of the &lt;strong&gt;thanos-receive-default&lt;/strong&gt; statefulset&lt;br&gt;
load-balance the incoming requests to the other receiver pods based on&lt;br&gt;
the hashring config maintained by the thanos receiver controller.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Install Thanos Store&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Create a &lt;a href="https://thanos.io/tip/components/store.md/"&gt;thanos store&lt;/a&gt; statefulsets.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; thanos-store-shard-0.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We have configured it such that the thanos querier fans out queries to the store only for data older than 2 weeks. Data from the most recent 15 days is served by the receiver pods. The 1-day overlap between the two time windows is intentional, for data resiliency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Install Thanos Querier&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Create a thanos querier deployment, expose it through service and ingress.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; thanos-query.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We configure the thanos querier to connect to the receiver(s) and store(s) for fanning out queries.&lt;/p&gt;

&lt;h3&gt;
  
  
  Install Prometheus(es)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Create shared resources&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl create ns sre
kubectl create ns tenant-a
kubectl create ns tenant-b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Install kube-prometheus-stack
&lt;/h3&gt;

&lt;p&gt;We install the&lt;br&gt;
&lt;a href="https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack"&gt;kube-prometheus-stack&lt;/a&gt; chart in the &lt;code&gt;sre&lt;/code&gt; namespace, with remote write pointed at the &lt;em&gt;thanos-receive&lt;/em&gt; service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm upgrade &lt;span class="nt"&gt;--namespace&lt;/span&gt; sre &lt;span class="nt"&gt;--debug&lt;/span&gt; &lt;span class="nt"&gt;--install&lt;/span&gt; cluster-monitor prometheus-community/kube-prometheus-stack &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; prometheus.ingress.enabled&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; prometheus.ingress.hosts[0]&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"cluster.prometheus.local"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; prometheus.prometheusSpec.remoteWrite[0].url&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"http://thanos-receive.thanos.svc.cluster.local:19291/api/v1/receive"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; alertmanager.ingress.enabled&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; alertmanager.ingress.hosts[0]&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"cluster.alertmanager.local"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; grafana.ingress.enabled&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; grafana.ingress.hosts[0]&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"grafana.local"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Install Prometheus and ServiceMonitor for tenant-a
&lt;/h3&gt;

&lt;p&gt;In &lt;em&gt;tenant-a&lt;/em&gt; namespace:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Deploy an nginx proxy to forward the requests from Prometheus to the &lt;em&gt;thanos-receive&lt;/em&gt; service in the &lt;em&gt;thanos&lt;/em&gt; namespace. It also sets the tenant header on the outgoing requests:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; nginx-proxy-a.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Create a&lt;br&gt;
&lt;a href="https://cloud.redhat.com/learn/topics/operators#prometheus"&gt;prometheus&lt;/a&gt; and a &lt;a href="https://cloud.redhat.com/learn/topics/operators#servicemonitor"&gt;servicemonitor&lt;/a&gt; to monitor itself&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; prometheus-tenant-a.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;
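&lt;p&gt;For reference, the core of such an nginx proxy is just a &lt;code&gt;proxy_pass&lt;/code&gt; plus a &lt;code&gt;proxy_set_header&lt;/code&gt; directive. The fragment below is illustrative, not the exact manifest from the repo:&lt;/p&gt;

```nginx
# Illustrative nginx fragment: forward remote write traffic and stamp the tenant header
server {
    listen 80;
    location / {
        # Tenant ID picked up by the Thanos receiver from this header
        proxy_set_header THANOS-TENANT tenant-a;
        proxy_pass http://thanos-receive.thanos.svc.cluster.local:19291;
    }
}
```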

&lt;h4&gt;
  
  
  Install Prometheus and ServiceMonitor for tenant-b
&lt;/h4&gt;

&lt;p&gt;In &lt;em&gt;tenant-b&lt;/em&gt; namespace:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Deploy an nginx proxy to forward the requests from Prometheus to&lt;br&gt;
the &lt;em&gt;thanos-receive&lt;/em&gt; service in the &lt;em&gt;thanos&lt;/em&gt; namespace. It also sets the&lt;br&gt;
tenant header on the outgoing requests:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; nginx-proxy-b.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Create a prometheus and a servicemonitor to monitor itself&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; prometheus-tenant-b.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Test the setup
&lt;/h2&gt;

&lt;p&gt;Access the thanos querier by port-forwarding the thanos-query service:&lt;br&gt;
&lt;br&gt;
  &lt;code&gt;kubectl port-forward svc/thanos-query 9090:9090 -n thanos&lt;/code&gt;&lt;br&gt;
&lt;br&gt;
Then open the thanos query UI at &lt;a href="http://localhost:9090/"&gt;http://localhost:9090/&lt;/a&gt; in the browser and&lt;br&gt;
execute the query &lt;code&gt;count(up) by (tenant_id)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Alternatively, if you have &lt;a href="https://stedolan.github.io/jq/"&gt;&lt;code&gt;jq&lt;/code&gt;&lt;/a&gt;&lt;br&gt;
installed, you can run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; http://localhost:9090/api/v1/query?query&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"count(up)by("&lt;/span&gt;tenant_id&lt;span class="s2"&gt;")"&lt;/span&gt;|jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.data.result[]|"\(.metric) \(.value[1])"'&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"tenant_id"&lt;/span&gt;:&lt;span class="s2"&gt;"a"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt; 1
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"tenant_id"&lt;/span&gt;:&lt;span class="s2"&gt;"b"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt; 1
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"tenant_id"&lt;/span&gt;:&lt;span class="s2"&gt;"cluster"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt; 17
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
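&lt;p&gt;To see what the &lt;code&gt;jq&lt;/code&gt; filter itself does, here it is applied to a canned (illustrative) query API response body instead of a live endpoint:&lt;/p&gt;

```shell
# Canned Prometheus query API response (illustrative), piped through the same jq filter
response='{"status":"success","data":{"result":[{"metric":{"tenant_id":"a"},"value":[1687617599,"1"]},{"metric":{"tenant_id":"b"},"value":[1687617599,"1"]}]}}'
# Interpolate each result's label set and value into one line per series
echo "$response" | jq -r '.data.result[]|"\(.metric) \(.value[1])"'
# {"tenant_id":"a"} 1
# {"tenant_id":"b"} 1
```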



&lt;p&gt;Either of the above outputs shows that the &lt;em&gt;cluster&lt;/em&gt;, &lt;em&gt;a&lt;/em&gt; and &lt;em&gt;b&lt;/em&gt; Prometheus&lt;br&gt;
tenants have 17, 1 and 1 scrape targets up and&lt;br&gt;
running, respectively. All this data is stored in the thanos-receiver in real&lt;br&gt;
time by Prometheus’ &lt;a href="https://prometheus.io/docs/practices/remote_write/#remote-write-characteristics"&gt;remote write&lt;br&gt;
queue&lt;/a&gt;.&lt;br&gt;
This model creates an opportunity for the tenant-side Prometheus to be&lt;br&gt;
nearly stateless yet maintain data resiliency.&lt;/p&gt;

&lt;p&gt;In our next post, we will improve this architecture to enforce tenant&lt;br&gt;
isolation on the thanos-querier side.&lt;/p&gt;

&lt;p&gt;I hope you found this blog informative and engaging.&lt;/p&gt;

&lt;h4&gt;
  
  
  References:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://thanos.io/tip/components/receive.md/"&gt;Thanos Receiver&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/observatorium/thanos-receive-controller"&gt;Github thanos-receive-controller&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/prometheus-monitoring-support/"&gt;Prometheus consulting and enterprise support capabilities&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>devops</category>
      <category>kubernetes</category>
      <category>thanos</category>
      <category>prometheus</category>
    </item>
    <item>
      <title>Testing your Infrastructure as Code using Terratest</title>
      <dc:creator>Akash Warkhade</dc:creator>
      <pubDate>Wed, 15 Jun 2022 10:49:34 +0000</pubDate>
      <link>https://forem.com/akashw/testing-your-infrastructure-as-code-using-terratest-3hb6</link>
      <guid>https://forem.com/akashw/testing-your-infrastructure-as-code-using-terratest-3hb6</guid>
      <description>&lt;p&gt;Setting up infrastructure manually can be a time-consuming and hectic process. That is where we can make use of Infrastructure as Code (IaC) tools to automate the infrastructure. IaC automation can be done for any kind of infrastructure, i.e. virtual machines, storage, etc. As more and more infrastructure becomes code, it is essential to have unit and integration tests for your IaC. We will briefly discuss what IaC is and what testing your infrastructure code means. Then we will deep dive into how we can use Terratest for IaC testing.&lt;/p&gt;

&lt;p&gt;Let’s begin, shall we?&lt;/p&gt;

&lt;h1&gt;
  
  
  Infrastructure as code (IaC)
&lt;/h1&gt;

&lt;p&gt;Infrastructure as Code is the process of provisioning and configuring an environment through code, instead of manually setting up the required infrastructure and its supporting systems through a GUI. For example: provisioning a virtual machine, configuring it, and setting up monitoring for it. Some examples of IaC tools are Terraform, Packer, Ansible, etc. With Infrastructure as Code, you can also track your infrastructure in a version control system such as Git, and modularize and templatize it in order to reuse the same code across multiple environments and regions. Disaster recovery is one of the important benefits you get from coding your infrastructure: with IaC, you can replicate your infrastructure in other regions or environments as quickly as possible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing infrastructure code
&lt;/h2&gt;

&lt;p&gt;IaC testing can be divided into multiple stages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Unit testing&lt;/li&gt;
&lt;li&gt;Integration testing&lt;/li&gt;
&lt;li&gt;Sanity or Static Analysis&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Sanity or Static Analysis
&lt;/h3&gt;

&lt;p&gt;This is the initial phase of testing your infrastructure code. In static analysis, we ensure that our code has correct syntax. It also helps ensure that our code meets industry standards and follows best practices. Linters fall into this category. Some examples of sanity-testing tools are Foodcritic for Chef, hadolint for Docker, and TFLint for Terraform.&lt;/p&gt;

&lt;h3&gt;
  
  
  Unit Testing
&lt;/h3&gt;

&lt;p&gt;With the help of unit testing, we assess our code without actually provisioning the infrastructure. Examples include asserting that your container runs as a non-root user, or that your cloud network security group only allows TCP protocols. Some unit-testing examples are Conftest for Terraform and ChefSpec for Chef cookbooks.&lt;/p&gt;

&lt;p&gt;Conftest example for enforcing execution as a non-root user:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package main

deny[msg] {
  input.kind == "Deployment"
  not input.spec.template.spec.securityContext.runAsNonRoot

  msg := "Containers must not run as root"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Integration testing
&lt;/h3&gt;

&lt;p&gt;In integration testing, we test our IaC by actually deploying it into the required environment. For example, if you deployed a virtual machine and hosted an Nginx server on port 80 on that machine, you would check whether port 80 is listening after deployment.&lt;/p&gt;

&lt;p&gt;Below is the example of doing that with ServerSpec:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;describe port(80) do
  it { should be_listening }
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this post, we are exploring integration testing of infrastructure code using Terratest.&lt;/p&gt;

&lt;h1&gt;
  
  
  What is Terratest? What can we achieve with it?
&lt;/h1&gt;

&lt;p&gt;Terratest is a Go library developed by Gruntwork that helps you create and automate tests for your Infrastructure as Code written with tools like Terraform and Packer, against IaaS providers such as Amazon and Google, or against a Kubernetes cluster. It provides various functions and patterns for tasks such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Testing Docker images, Helm charts, and Packer templates.&lt;/li&gt;
&lt;li&gt;Working with various cloud provider APIs, such as AWS and Azure.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Terratest executes sanity and functional testing for your infrastructure code. With Terratest you can easily identify issues in your current infrastructure code and fix them as soon as possible. You can also leverage Terratest for compliance testing of your infrastructure, for example, to verify that versioning and encryption are enabled on any new S3 bucket created through your IaC.&lt;/p&gt;
&lt;h2&gt;
  
  
  Installation of required binaries for Terratest
&lt;/h2&gt;

&lt;p&gt;Terratest mainly requires Terraform and Go for execution. In this blog post, we have used Terraform version 1.0.0 and Go version 1.17.6 for our testing.&lt;/p&gt;
&lt;h3&gt;
  
  
  Installing Terraform
&lt;/h3&gt;

&lt;p&gt;Follow the &lt;a href="https://www.terraform.io/downloads"&gt;downloads section&lt;/a&gt; of the Terraform website to install Terraform on your machine; you can use a package manager or download the binary and make it available in your PATH.&lt;/p&gt;

&lt;p&gt;After installation, verify that it is installed properly by running the below command:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;terraform version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Go and test dependency installation can be done with the following steps:&lt;/p&gt;
&lt;h3&gt;
  
  
  Installing Go
&lt;/h3&gt;

&lt;p&gt;You can use your Linux distribution’s package manager to install Go, or follow the Go &lt;a href="https://go.dev/doc/install"&gt;installation documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The go test command might require gcc for test execution; you can install it using your distribution’s package manager. For example, on CentOS/Amazon Linux 2, you can use yum install -y gcc.&lt;/p&gt;
&lt;h1&gt;
  
  
  Terratest in action
&lt;/h1&gt;

&lt;p&gt;Now we will execute some integration tests using Terratest. Once the installation steps are complete, clone the &lt;a href="https://github.com/akash123-eng/terratest-sample"&gt;terratest-sample repository&lt;/a&gt; to start executing Terratest. &lt;br&gt;
We’ll start by writing the test using Go and then execute it.&lt;/p&gt;

&lt;p&gt;First things first:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Your test file name should have _test in its name, for example sample_test.go. This is how Go looks for test files.&lt;/li&gt;
&lt;li&gt;Your test function name should start with Test, with T being a capital letter. For example, TestFunction would work, but testFunction will give you the error “no tests to run”.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  Setup AWS auth configuration
&lt;/h3&gt;

&lt;p&gt;We need AWS credentials to set up the infrastructure in AWS; we can configure them using environment variables or a shared credentials file. Refer to the Terraform &lt;a href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs#authentication-and-configuration"&gt;documentation&lt;/a&gt; for more details.&lt;/p&gt;

&lt;p&gt;Terraform code for the infrastructure can be found in the respective folder of each component: for EC2, it’s under the ec2_instance folder, and for API Gateway, it’s under the api_gateway folder. Terratest takes the output from Terraform’s output.tf as input for its tests. &lt;br&gt;
Below is a snippet for testing that the EC2 instance uses the SSH key we expect.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package terratest

import (
   "testing"
   "github.com/stretchr/testify/assert"
   "github.com/gruntwork-io/terratest/modules/terraform"
)

func TestEc2SshKey(t *testing.T) {
    terraformOptions := terraform.WithDefaultRetryableErrors(t, &amp;amp;terraform.Options{
        TerraformDir: "../terraform",
    })
    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)
    ec2SshKey  := terraform.Output(t, terraformOptions, "instance_ssh_key")
    assert.Equal(t, "terratest", ec2SshKey)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We will divide it into different parts for a proper understanding. In the first step, we define a Go package named terratest, and then we import the different packages required for test execution.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;package terratest

import (
   "testing"
   "github.com/stretchr/testify/assert"
   "github.com/gruntwork-io/terratest/modules/terraform"
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once we have all the prerequisites in place, we create a function to execute the actual test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func TestEc2SshKey(t *testing.T) {
    terraformOptions := terraform.WithDefaultRetryableErrors(t, &amp;amp;terraform.Options{
        TerraformDir: "../terraform",
    })
    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)
    ec2SshKey  := terraform.Output(t, terraformOptions, "instance_ssh_key")
    assert.Equal(t, "terratest", ec2SshKey)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the below section, we define the directory where Terratest should look for the Terraform manifests (i.e. main.tf and output.tf) used for infrastructure creation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; terraformOptions := terraform.WithDefaultRetryableErrors(t, &amp;amp;terraform.Options{
     TerraformDir: "../terraform",
 })
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In Go, we use defer to schedule a cleanup task; here, that task should be terraform destroy.&lt;br&gt;
We define that using the below snippet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;defer terraform.Destroy(t, terraformOptions)
Now we can move forward to actual execution:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With terraform.InitAndApply, we invoke the terraform init and terraform apply commands that we generally use for Terraform execution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   terraform.InitAndApply(t, terraformOptions)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As mentioned earlier, Terratest looks up the values it needs from the outputs defined in output.tf.&lt;/p&gt;

&lt;p&gt;In the below snippet, we take the SSH key from the Terraform output and compare it with the SSH key name we have defined:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    ec2SshKey  := terraform.Output(t, terraformOptions, "instance_ssh_key")
    assert.Equal(t, "terratest", ec2SshKey)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Executing tests
&lt;/h3&gt;

&lt;p&gt;Switch to the directory where you have cloned the repository, then navigate to the location of the test files.&lt;/p&gt;

&lt;p&gt;Initialize Go modules and download the dependencies. Take a look at the Setting up your project section of the Terratest &lt;a href="https://terratest.gruntwork.io/docs/getting-started/quick-start/#setting-up-your-project"&gt;documentation&lt;/a&gt; for more details.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;go mod init ec2_instance
go mod tidy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And finally, execute the test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ go test -v

--- PASS: TestEc2SshKey (98.72s)
PASS
ok      command-line-arguments  98.735s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Let’s go a bit more advanced with Terratest
&lt;/h1&gt;

&lt;p&gt;In the previous section, we performed some basic testing using Terratest. Now, we will perform a more advanced test by deploying an API Gateway with Lambda and an ALB as backends.&lt;/p&gt;

&lt;h3&gt;
  
  
  High-level functionality
&lt;/h3&gt;

&lt;p&gt;GET requests to the API Gateway will be served by the ALB, and the ANY method will be served by Lambda through the API Gateway. After deployment, we will make an HTTP GET request against the gateway deployment URL and check whether it returns a success code.&lt;/p&gt;

&lt;p&gt;Note: In our execution we are not using an API key for authentication, but you should use one to replicate a more realistic use of API Gateway.&lt;/p&gt;

&lt;h3&gt;
  
  
  Terraform output.tf
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;output "lb_address" {
  value = aws_lb.load-balancer.dns_name
  description = "DNS of load balancer"
}


output "api_id" {
  description = "REST API id"
  value       = aws_api_gateway_rest_api.api.id
}

output "deployment_invoke_url" {
  description = "Deployment invoke url"
  value       = "${aws_api_gateway_stage.test.invoke_url}/resource"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Code snippet for test execution
&lt;/h3&gt;

&lt;p&gt;In the first scenario we explained the basic syntax, so we will go directly to the test function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func TestApiGateway(t *testing.T) {
    //awsRegion := "eu-west-2"
    terraformOptions := terraform.WithDefaultRetryableErrors(t, &amp;amp;terraform.Options{
        TerraformDir: "../",
    })
    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)
    stageUrl := terraform.Output(t, terraformOptions,"deployment_invoke_url")
    time.Sleep(30 * time.Second)
    statusCode := DoGetRequest(t, stageUrl)
    assert.Equal(t, 200 , statusCode)
}

func DoGetRequest(t terra_test.TestingT, api string) int{
   resp, err := http.Get(api)
   if err != nil {
      log.Fatalln(err)
   }
   //We Read the response status on the line below.
   return resp.StatusCode
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the above snippet, we defined a helper function, DoGetRequest, to run an HTTP GET test. TestApiGateway then asserts on the status code this helper returns.&lt;/p&gt;

&lt;h3&gt;
  
  
  Test execution and output
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TestApiGateway 2022-03-01T06:56:18Z logger.go:66: deployment_invoke_url = "https://iuabeqgmj2.execute-api.eu-west-1.amazonaws.com/test/resource"
TestApiGateway 2022-03-01T06:56:18Z logger.go:66: lb_address = "my-demo-load-balancer-376285754.eu-west-1.elb.amazonaws.com"
TestApiGateway 2022-03-01T06:56:18Z retry.go:91: terraform [output -no-color -json deployment_invoke_url]
TestApiGateway 2022-03-01T06:56:18Z logger.go:66: Running command terraform with args [output -no-color -json deployment_invoke_url]
TestApiGateway 2022-03-01T06:56:19Z logger.go:66: "https://iuabeqgmj2.execute-api.eu-west-1.amazonaws.com/test/resource"
--- PASS: TestApiGateway (42.34s)
PASS
ok      command-line-arguments  42.347s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, it executed our test function TestApiGateway, which performed an HTTP GET test on the deployment_invoke_url of the API Gateway and returned the test status.&lt;/p&gt;

&lt;h1&gt;
  
  
  Terratest module extensibility and compliance testing
&lt;/h1&gt;

&lt;p&gt;We can also utilize Terratest for compliance testing. Some examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check whether encryption is enabled on your SQS queue or S3 bucket.&lt;/li&gt;
&lt;li&gt;Verify whether a particular throttling limit is set for your API Gateway.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We have developed a Terratest check for API Gateway. In this example, we verify that an authorizer is attached to your API Gateway. You can find out more about authorizers in the AWS &lt;a href="https://docs.aws.amazon.com/apigateway/latest/developerguide/apigateway-use-lambda-authorizer.html"&gt;documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Currently, Terratest does not have an API Gateway module among its AWS modules. You can find the available AWS modules in the &lt;a href="https://github.com/gruntwork-io/terratest/tree/master/modules/aws"&gt;Terratest AWS modules directory&lt;/a&gt;. Other Terratest modules, such as Docker, Packer, and Helm, can be found in the Terratest &lt;a href="https://github.com/gruntwork-io/terratest/tree/master/modules"&gt;modules directory&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We have created our own test function for the authorizer check using Terratest and AWS Go SDK methods; see &lt;a href="https://github.com/akash123-eng/terratest-sample/blob/main/api_gateway/terratest/authorizer_test.go"&gt;authorizer_test.go&lt;/a&gt;.&lt;br&gt;
More information on how to use the AWS Go SDK is available in the &lt;a href="https://docs.aws.amazon.com/sdk-for-go/api/"&gt;AWS SDK for Go API reference&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;Enterprises and their customers want products to be shipped faster, and Infrastructure as Code provides just that with faster provisioning of infrastructure. As more and more infrastructure becomes code, the need for testing increases. In this post, we discussed how a tool like Terratest can help validate your code before you deploy it to production. We showed how Terratest works and executed test cases to demonstrate it. One of the good things about Terratest is its extensibility, which you can achieve by using and building on its modules, as discussed in the post.&lt;/p&gt;

&lt;p&gt;That’s all for this post. If you are working with Terratest or plan to use it and need some assistance, feel free to reach out to me via LinkedIn. I’m always excited to hear your thoughts!&lt;/p&gt;

</description>
      <category>iac</category>
      <category>testing</category>
      <category>devops</category>
      <category>terratest</category>
    </item>
  </channel>
</rss>
