<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Ankit Bansal</title>
    <description>The latest articles on Forem by Ankit Bansal (@ankitbansal).</description>
    <link>https://forem.com/ankitbansal</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F111584%2F630d3e69-552f-4f07-80b0-231dd97498c5.png</url>
      <title>Forem: Ankit Bansal</title>
      <link>https://forem.com/ankitbansal</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/ankitbansal"/>
    <language>en</language>
    <item>
      <title>Testing strategy to achieve continuous delivery in Microservices - Part 2</title>
      <dc:creator>Ankit Bansal</dc:creator>
      <pubDate>Sat, 04 Jul 2020 07:00:00 +0000</pubDate>
      <link>https://forem.com/ankitbansal/testing-strategy-to-achieve-continuous-delivery-in-microservices-part-2-411l</link>
      <guid>https://forem.com/ankitbansal/testing-strategy-to-achieve-continuous-delivery-in-microservices-part-2-411l</guid>
      <description>&lt;p&gt;In &lt;a href="https://ankitbansalblog.medium.com/testing-strategy-to-achieve-continuous-delivery-in-microservices-4d613c3c65b7"&gt;Part 1&lt;/a&gt; we discussed how to build our test strategy for continuous delivery in a microservice architecture. In this part, we will discuss in detail what each kind of test should cover.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unit Tests&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Unit tests are supposed to test one layer at a time; any dependencies within the class should be mocked or stubbed. They are very fast to run and provide immediate feedback, and they make it faster to debug and identify issues. I suggest employing Test-Driven Development (TDD) to write high-quality unit tests. Beyond verifying behavior, unit tests also provide code documentation and improve code modularity.&lt;/p&gt;

&lt;p&gt;Here are some best practices to employ while writing unit tests:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Write a separate unit test for each scenario within a method.&lt;/li&gt;
&lt;li&gt;Test names should be descriptive and clearly convey intent.&lt;/li&gt;
&lt;li&gt;Don’t write separate tests for private methods. If too much logic is being tested through a single method, evaluate how the code can be modularized further.&lt;/li&gt;
&lt;li&gt;There is no need to write separate tests for POJOs and utility classes; these should be covered by the service-class tests.&lt;/li&gt;
&lt;li&gt;Avoid writing tests for generated code, though you can keep a few test cases to verify the intended behavior.&lt;/li&gt;
&lt;/ul&gt;
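
&lt;p&gt;As an illustration of these practices, here is a minimal Python sketch (the class and method names are hypothetical): one test per scenario, descriptive test names, and the dependency mocked rather than invoked.&lt;/p&gt;

```python
import unittest
from unittest.mock import Mock

# Hypothetical service under test: it converts an amount using a rate
# fetched from a dependency, which the unit tests will mock.
class PriceConverter:
    def __init__(self, rate_client):
        self.rate_client = rate_client

    def convert(self, amount, currency):
        rate = self.rate_client.get_rate(currency)
        if rate is None:
            raise ValueError("unknown currency: " + currency)
        return round(amount * rate, 2)

class PriceConverterTest(unittest.TestCase):
    # One scenario per test, with a name that states the intent.
    def test_convert_returns_amount_scaled_by_rate(self):
        client = Mock()
        client.get_rate.return_value = 0.5
        converter = PriceConverter(client)
        self.assertEqual(converter.convert(10, "EUR"), 5.0)

    def test_convert_raises_for_unknown_currency(self):
        client = Mock()
        client.get_rate.return_value = None
        converter = PriceConverter(client)
        with self.assertRaises(ValueError):
            converter.convert(10, "XXX")
```

&lt;p&gt;Run with python -m unittest; since the dependency is a mock, both tests finish in milliseconds and pinpoint the failing scenario.&lt;/p&gt;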

&lt;p&gt;&lt;strong&gt;Integration Tests&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Since unit tests only cover one layer at a time, integration tests are required to ensure that multiple layers work together as expected. Integration tests are different from end-to-end tests in that they don’t try to cover the whole integration, only the integration of one or two layers at a time.&lt;/p&gt;

&lt;p&gt;In a microservice, I suggest employing integration tests at the Resource/Controller level:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Have API tests for each API covering at least one success and one or more failure scenarios.&lt;/li&gt;
&lt;li&gt;Mock external services, using their REST contracts to provide the mocked behavior.&lt;/li&gt;
&lt;li&gt;Prefer not to mock the database layer.&lt;/li&gt;
&lt;/ul&gt;
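
&lt;p&gt;A minimal sketch of these suggestions, with a hypothetical order controller: the external payment service is mocked through its contract, while the database layer is kept real (an in-memory SQLite stands in for it here).&lt;/p&gt;

```python
import sqlite3
import unittest
from unittest.mock import Mock

# Hypothetical controller wired to a real (in-memory SQLite) database
# layer and a mocked external payment service.
class OrderController:
    def __init__(self, db, payment_service):
        self.db = db
        self.payment_service = payment_service

    def create_order(self, item, amount):
        if not self.payment_service.charge(amount):
            return {"status": 402, "body": "payment declined"}
        self.db.execute("INSERT INTO orders(item, amount) VALUES (?, ?)", (item, amount))
        self.db.commit()
        return {"status": 201, "body": item}

class OrderApiTest(unittest.TestCase):
    def setUp(self):
        # The database layer is not mocked: a real in-memory SQLite db.
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE orders(item TEXT, amount REAL)")
        self.payments = Mock()  # external service mocked via its contract

    def test_create_order_success(self):
        self.payments.charge.return_value = True
        controller = OrderController(self.db, self.payments)
        response = controller.create_order("book", 9.99)
        self.assertEqual(response["status"], 201)
        rows = self.db.execute("SELECT item FROM orders").fetchall()
        self.assertEqual(rows, [("book",)])

    def test_create_order_payment_failure(self):
        self.payments.charge.return_value = False
        controller = OrderController(self.db, self.payments)
        response = controller.create_order("book", 9.99)
        self.assertEqual(response["status"], 402)
```

&lt;p&gt;Because the database layer is not mocked, the test also catches SQL and schema mistakes that pure unit tests would miss.&lt;/p&gt;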

&lt;p&gt;Some people suggest having only integration tests, and no unit tests, for microservices, since the services are usually small and mostly CRUD. From my experience, services might start as simple CRUD, but they usually grow, and then it becomes difficult to manage a service based on integration tests alone. So this call can be taken depending on the service’s scope: for some microservices it works and is indeed the better option.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Contract Tests&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With unit tests and integration tests, we have covered most of the code we wrote ourselves. However, we are still relying on mock contracts for internal and external services. To ensure that a provider service doesn’t make a change that breaks our service, contract tests need to be written. Contract testing should check the success and error responses, verifying that the returned object conforms to the contract, while avoiding too much business-validation testing. It should also avoid benchmarking the service. These tests can be run on a scheduled basis.&lt;/p&gt;

&lt;p&gt;Another option is consumer-driven contracts (CDC). Here the consumer service provides a test suite to the provider, and the provider service can run this suite before going live to ensure its new changes don’t break the consumer.&lt;/p&gt;
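
&lt;p&gt;A consumer-driven contract can be sketched as data plus a shape check. The field names below are hypothetical; the point is that the consumer publishes the shape it relies on, and the provider verifies its responses against it before going live, without re-testing business rules.&lt;/p&gt;

```python
# The consumer publishes the response shapes it depends on.
SUCCESS_CONTRACT = {"id": int, "name": str, "active": bool}
ERROR_CONTRACT = {"code": int, "message": str}

def matches_contract(response, contract):
    # Only the fields and types promised by the contract are checked;
    # business rules stay out of contract tests.
    return all(
        field in response and isinstance(response[field], expected_type)
        for field, expected_type in contract.items()
    )

# Simulated provider responses; in a real setup these would come from
# the provider service itself before it goes live.
success_response = {"id": 7, "name": "alice", "active": True, "extra": "ignored"}
error_response = {"code": 404, "message": "not found"}

assert matches_contract(success_response, SUCCESS_CONTRACT)
assert matches_contract(error_response, ERROR_CONTRACT)
```

&lt;p&gt;Extra fields in the response are deliberately ignored, so the provider can evolve its API without breaking the contract.&lt;/p&gt;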

&lt;p&gt;&lt;strong&gt;Service API Tests&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One of the critical pieces of testing in the microservices world is API testing of the service. Though we have covered the service via unit and integration testing, the end behavior of the service still needs to be verified. Most of the time, different teams work on different microservices within a product, and it makes sense to have well-defined contracts between them. Also, microservices are often directly exposed to the external world. So end-to-end testing of each microservice must be performed to verify the desired behavior.&lt;/p&gt;

&lt;p&gt;Service API testing doesn’t rely on mocks but instead verifies the actual behavior of the service. Sometimes a few services work together as a cluster to deliver a functionality and are released together; in those cases we can define service API tests for those services jointly.&lt;/p&gt;
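
&lt;p&gt;A service API test can be sketched as follows, with a toy in-process HTTP service standing in for a deployed microservice (the /health endpoint and its payload are made up). Note that the test talks to the running service over real HTTP, with no mocks involved.&lt;/p&gt;

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Toy stand-in for a deployed microservice; a real service API test
# would point at the actual deployed endpoint instead.
class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"status": "UP"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the test output quiet

server = HTTPServer(("127.0.0.1", 0), StatusHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The test: call the live service over HTTP and verify the response.
url = "http://127.0.0.1:%d/health" % server.server_port
with urllib.request.urlopen(url) as resp:
    status = resp.status
    payload = json.loads(resp.read())
server.shutdown()

assert status == 200
assert payload == {"status": "UP"}
```

&lt;p&gt;The same shape of test works against a cluster of services released together: point the URL at the cluster’s entry point and assert on the responses.&lt;/p&gt;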

&lt;p&gt;&lt;strong&gt;End to End tests&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;End-to-end tests, also called functional tests, cover the whole system and its user journeys, verifying that the system behaves correctly end to end. They give the maximum confidence, as they test the complete integration, and since they are mostly written by QA, they also provide another pair of eyes on the functionality developed. They matter most when deciding whether our code is ready to be shipped. However, they are slow to run, so it might not be possible to run them all before every new release. A good functional test suite should cover all the major user journeys while relying on the other layers to cover the detailed scenarios.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Non-functional Tests&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Another category is non-functional tests. These include performance testing, load testing, security testing, etc. Depending on the application’s requirements, these can be built and integrated into the pipeline.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Testing strategy to achieve continuous delivery in Microservices</title>
      <dc:creator>Ankit Bansal</dc:creator>
      <pubDate>Thu, 28 May 2020 07:00:00 +0000</pubDate>
      <link>https://forem.com/ankitbansal/testing-strategy-to-achieve-continuous-delivery-in-microservices-9g4</link>
      <guid>https://forem.com/ankitbansal/testing-strategy-to-achieve-continuous-delivery-in-microservices-9g4</guid>
      <description>&lt;p&gt;Continuous delivery is widely popular practice today. Product team wants to release new features as soon as possible focusing on the minimum viable product (MVP). The idea is to get faster feedback. Also, any bug reported from the clients should be addressed asap to keep customers happy. Continuous delivery is the ability to make this happen while ensuring high quality, low risk and reliable releases. This allows team to deploy multiple releases in a week or even day.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--tMQlfRUT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://miro.medium.com/max/700/0%2A-qVdGSssY07jso8H.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tMQlfRUT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://miro.medium.com/max/700/0%2A-qVdGSssY07jso8H.png" alt="Continuous Delivery" width="700" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One of the core foundations of continuous delivery is automated tests. These are the tests that run as part of the continuous pipeline to ensure the code works as expected. Some people consider only functional tests to be automation tests. However, automation tests are any kind of tests that can be run automatically, e.g. unit, integration, API, functional, and load tests. With so many types of tests, the question is how we should shape our testing strategy to ensure bug-free releases, and how that strategy plays out in the microservices world.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Microservice Architecture&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Microservice architecture is an increasingly popular way to build software. It proposes creating smaller services with well-defined boundaries, which lets teams concentrate on their own service and release faster; the idea is to be able to release your service independently. While microservices are ideal for continuous delivery, one major challenge is that a functionality change often depends on changes in multiple services. Thus we need to ensure that the services, once integrated, don’t lead to regression issues. Traditionally, companies have relied on a comprehensive functional test suite and manual testing to verify end-to-end behavior, but does that strategy work for microservices? Let’s discuss the major issues with it and how we can employ continuous testing to achieve continuous delivery.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Issues with large functional suite&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most companies focus on building a high-quality, comprehensive functional suite for regression testing. A functional suite tests user journeys and verifies that the product works end to end as expected. No doubt functional tests provide maximum confidence in a build and are a must-have. However, there are a couple of issues with relying on them heavily.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run at the end of the cycle: Functional tests run at the end of the cycle, so many integration and functional issues are identified only at a later stage, which delays the release. One principle of continuous delivery is to fail early, which should be employed for timely and fast releases.&lt;/li&gt;
&lt;li&gt;Time consuming: Functional tests take a lot of time to run, since they run in a browser and cover user journeys. Also, many actions are asynchronous, which involves a lot of wait time. I have seen functional suites run for as long as a day.&lt;/li&gt;
&lt;li&gt;Flakiness: Functional suites can also fail for many reasons outside our control, such as a slow network, which makes the tests flaky.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thus, relying only on functional tests should be avoided. Instead, teams should employ the test pyramid.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Test Pyramid&lt;/strong&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--c2d-hkNW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://miro.medium.com/max/451/0%2AMjIb7T5Ugi3kBovq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--c2d-hkNW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://miro.medium.com/max/451/0%2AMjIb7T5Ugi3kBovq.png" alt="Test Pyramid" width="451" height="415"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The test pyramid is not a new concept, but many companies still struggle to adopt it. It suggests having tests at different granularities: the higher you go in the pyramid, the fewer tests you should have. So have lots of unit tests covering the whole functionality, a good number of integration and service tests to test the integration of layers, and finally a small number of functional tests to cover the end-to-end functionality. These tests run at different stages of the pipeline and together certify the build.&lt;/p&gt;

&lt;p&gt;When we consider microservices, we first need to ensure that each service is self-tested. Good-quality unit and integration tests help here, but they usually mock external services. A critical piece for microservices is API tests. API tests can cover an individual service or a cluster of services together; they let us ensure that the service integration works and that the contract is not broken. This is especially important when the service is public. Finally, we have functional tests to cover the end-to-end user journeys. These should ensure that the major use cases are tested before a new release goes live.&lt;/p&gt;

&lt;p&gt;In the next part, I will discuss what each of these tests should cover and how they work together to achieve faster, bug-free releases.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Centralized logging in Oracle Kubernetes Engine with Object Storage</title>
      <dc:creator>Ankit Bansal</dc:creator>
      <pubDate>Tue, 09 Apr 2019 03:30:00 +0000</pubDate>
      <link>https://forem.com/ankitbansal/centralized-logging-in-oracle-kubernetes-engine-with-object-storage-1ael</link>
      <guid>https://forem.com/ankitbansal/centralized-logging-in-oracle-kubernetes-engine-with-object-storage-1ael</guid>
      <description>&lt;p&gt;There are various ways to capture logs in kubernetes. One of the simplest mechanism is to have pod level script that can upload logs to destination system. This approach can work when you are trying out kubernetes but as soon as you decide to use kubernetes in production and have multiple applications to deploy, you see a need of having a centralized logging mechanism which is independent of pod lifecycle.&lt;/p&gt;

&lt;p&gt;We had a requirement to set up logging for our kubernetes cluster. The cluster is built on &lt;a href="https://docs.cloud.oracle.com/iaas/Content/ContEng/Concepts/contengoverview.htm"&gt;Oracle Kubernetes Engine (OKE)&lt;/a&gt; and we wanted to persist logs in &lt;a href="https://docs.cloud.oracle.com/iaas/Content/Object/Concepts/objectstorageoverview.htm"&gt;OCI Object Storage&lt;/a&gt;. The first step was to find an efficient way of capturing logs. After looking into multiple options, we found a Daemonset to be a great choice for the following reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Kubernetes ensures that every node runs a copy of the pod. Whenever a new node is added to the cluster, kubernetes automatically brings up a pod on it. This can be customized further to choose only nodes matching a selection criterion.&lt;/li&gt;
&lt;li&gt;It avoids changing individual application deployments. If you later want to change your logging mechanism, you only need to change your Daemonset.&lt;/li&gt;
&lt;li&gt;There is no impact on application performance due to log capturing, as it runs outside the application pod.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once that was finalized, the next step was to configure the Daemonset to capture logs from OKE and publish them to OCI Object Storage. We had used fluentd before and decided to go ahead with it. Fluentd already has an &lt;a href="https://github.com/fluent/fluentd-kubernetes-daemonset/blob/master/docker-image/v1.3/debian-s3/Dockerfile"&gt;image&lt;/a&gt; for configuring a daemonset that uploads to S3. Since Object Storage is compatible with the S3 API, we were able to use it with some customizations of fluent.conf.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Setting up cluster role&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The fluentd daemonset needs to run in kube-system. The first step is to create a new service account and grant it the required privileges.&lt;/p&gt;

&lt;p&gt;Create Service Account:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;kubectl create -f serviceaccount.yaml&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Create Cluster Role:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: fluentd
  namespace: kube-system
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;kubectl create -f clusterrole.yaml&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Create binding for cluster role with account:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;kubectl create -f clusterrolebinding.yaml&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Create Daemonset&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Next, we create a config map to provide the custom fluentd configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;---
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-config
  namespace: kube-system
data:
  fluent.conf: |
    @include "#{ENV['FLUENTD_SYSTEMD_CONF'] || '../systemd'}.conf"
    @include "#{ENV['FLUENTD_PROMETHEUS_CONF'] || '../prometheus'}.conf"
    @include ../kubernetes.conf
    @include conf.d/*.conf

    &amp;lt;match **&amp;gt;
      @type s3
      @id out_s3
      @log_level info
      s3_bucket "#{ENV['S3_BUCKET_NAME']}"
      s3_endpoint "#{ENV['S3_ENDPOINT']}"
      s3_region "#{ENV['S3_BUCKET_REGION']}"
      s3_object_key_format %{path}%Y/%m/%d/cluster-log-%{index}.%{file_extension}
      &amp;lt;inject&amp;gt;
        time_key time
        tag_key tag
        localtime false
      &amp;lt;/inject&amp;gt;
      &amp;lt;buffer&amp;gt;
        @type file
        path /var/log/fluentd-buffers/s3.buffer
        timekey 3600
        timekey_use_utc true
        chunk_limit_size 256m
      &amp;lt;/buffer&amp;gt;
    &amp;lt;/match&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;kubectl create -f configmap.yaml&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now we can go ahead and create the daemonset:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-logging
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.3.3-debian-s3-1.3
        env:
          - name: AWS_ACCESS_KEY_ID
            value: "#{OCI_ACCESS_KEY}"
          - name: AWS_SECRET_ACCESS_KEY
            value: "#{OCI_ACCESS_SECRET}"
          - name: S3_BUCKET_NAME
            value: "#{BUCKET_NAME}"
          - name: S3_BUCKET_REGION
            value: "#{OCI_REGION}"
          - name: S3_ENDPOINT
            value: "#{OBJECT_STORAGE_END_POINT}"
          - name: FLUENT_UID
            value: "0"
          - name: FLUENTD_CONF
            value: "override/fluent.conf"
          - name: FLUENTD_SYSTEMD_CONF
            value: "disable"
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log/
        - name: u01data
          mountPath: /u01/data/docker/containers/
          readOnly: true
        - name: fluentconfig
          mountPath: /fluentd/etc/override/
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log/
      - name: u01data
        hostPath:
          path: /u01/data/docker/containers/
      - name: fluentconfig
        configMap:
          name: fluent-config
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A couple of things to note:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We provide a custom fluent.conf via the config map and mount it in the daemonset. This is required to provide an explicit s3_endpoint, since the image by default doesn’t have a way to set a custom s3_endpoint.&lt;/li&gt;
&lt;li&gt;The following environment variables need to be configured. S3_BUCKET_REGION is the OCI region, e.g. us-ashburn-1. S3_ENDPOINT is the Object Storage endpoint, e.g. https://#{tenantname}.compat.objectstorage.us-ashburn-1.oraclecloud.com. AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the customer secret keys for your user; if you don’t have them already, refer to the &lt;a href="https://docs.cloud.oracle.com/iaas/Content/Object/Tasks/s3compatibleapi.htm"&gt;doc&lt;/a&gt; to generate them. S3_BUCKET_NAME is the Object Storage bucket that stores the logs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s go ahead and create daemonset:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;kubectl create -f daemonset.yaml&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Once configured, you should be able to see logs in Object Storage. If the bucket doesn’t exist, fluentd will create it.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>logging</category>
      <category>daemonset</category>
      <category>oke</category>
    </item>
    <item>
      <title>Building custom image in kubernetes cluster with kaniko</title>
      <dc:creator>Ankit Bansal</dc:creator>
      <pubDate>Wed, 16 Jan 2019 05:35:00 +0000</pubDate>
      <link>https://forem.com/ankitbansal/building-custom-image-in-kubernetes-cluster-with-kaniko-11km</link>
      <guid>https://forem.com/ankitbansal/building-custom-image-in-kubernetes-cluster-with-kaniko-11km</guid>
      <description>&lt;p&gt;One of the challenge we faced while migrating infrastructure from custom orchestration engine to Kubernetes was how to build image. As part of our own orchestration engine, we used to build image on the fly and publish it to registry. Since, kubernetes requires image to be already built and published, this brought a need of another layer on top of it that can build images and publish to registry.&lt;/p&gt;

&lt;p&gt;When working in your local environment, this issue doesn't crop up often, since you usually have a local docker daemon to build and publish images. However, it doesn't work when operating as a platform: multiple people can be building images concurrently, so you would need to think about the whole infrastructure required for building images, e.g. server requirements, autoscaling, security, etc. This is something we wanted to avoid, and while searching through various options we came across kaniko, which provides the capability to build images within the cluster.&lt;/p&gt;

&lt;p&gt;Though this was our use case, there are other scenarios where folks would take this route, such as avoiding setting up docker on their local machines. In this article, I will discuss the step-by-step process of building and publishing an image using kaniko.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Creating Build Context&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To build an image using kaniko, the first step is to create the build context and publish it to a compatible storage. The build context contains your source code, dockerfile, etc.; essentially, it is the source folder you would use to run the “docker build” command. You need to archive it and publish it to a compatible storage. At the time of writing, kaniko only supported GCS and S3 storage.&lt;/p&gt;
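
&lt;p&gt;Packaging the build context can be sketched with Python’s standard library (the file names below are placeholders); the upload to GCS/S3 is left to your storage client of choice.&lt;/p&gt;

```python
import os
import tarfile
import tempfile

# Sketch: package a build context (dockerfile plus app sources) into
# context.tar.gz, the artifact kaniko later downloads from GCS/S3.
# The sample files below stand in for a real source folder.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "dockerfile"), "w") as f:
    f.write("FROM ruby:2.4.5-jessie\n")
with open(os.path.join(workdir, "httpserver.rb"), "w") as f:
    f.write("puts 'hello'\n")

context_path = os.path.join(workdir, "context.tar.gz")
with tarfile.open(context_path, "w:gz") as tar:
    # Add each file at the archive root, the layout kaniko expects.
    for name in ("dockerfile", "httpserver.rb"):
        tar.add(os.path.join(workdir, name), arcname=name)

# The tarball would then be uploaded, e.g. to s3://your-bucket/context.tar.gz.
with tarfile.open(context_path) as tar:
    names = sorted(tar.getnames())

assert names == ["dockerfile", "httpserver.rb"]
```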

&lt;p&gt;Unfortunately, we were on openstack, so an additional step was to create a kubernetes pod that downloads the application zip from openstack, creates a context file including our dockerfile and the application zip, and finally pushes it to S3.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Setting up credentials&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Two credentials are required for image creation: one to download the context from storage, and the other to push the image to the docker registry. For AWS, you can create a credentials file using your key id and access key. This file can then be used to create the secret.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[default]
aws_access_key_id=SECRET_ID
aws_secret_access_key=SECRET_TOKEN
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;kubectl create secret generic aws-secret --from-file=credentials&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The other credential needed is for pushing the image to the registry. Any registry can be used to publish the image; we were using dockerhub. Create a config file containing base64-encoded credentials and use it to create the configmap.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "auths": {
        "https://index.docker.io/v1/": {
            "auth": "BASE_64_AUTH"
        },
    "HttpHeaders": {
        "User-Agent": "Docker-Client/18.06.1-ce (linux)"
    }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;kubectl create configmap docker-config --from-file=config.json&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Creating deployment to publish image&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Kaniko doesn’t use the docker daemon to create the image. Instead, it has its own mechanism of reading the dockerfile line by line and taking filesystem snapshots as it goes. It publishes its own image, named executor, on gcr.io to accomplish this.&lt;/p&gt;

&lt;p&gt;Since building and publishing the image is a one-time activity, we create a kubernetes job to accomplish it. You need to mount aws-secret and docker-config for authentication purposes. Two environment variables are needed: AWS_REGION to provide the name of the region in which the context is present, and DOCKER_CONFIG to specify the docker credentials path. Kaniko ignores these folders while creating the snapshot.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: batch/v1
kind: Job
metadata:
  name: image-publisher
spec:
  template:
    spec:
      containers:
      - name: image-publisher
        image: gcr.io/kaniko-project/executor:latest
        args: ["--dockerfile=dockerfile",
               "--context=s3://imagetestbucket123/context.tar.gz",
               "--destination=index.docker.io/ankitbansal/httpserver:1.0"]
        volumeMounts:
        - name: aws-secret
          mountPath: /root/.aws/
        - name: docker-config
          mountPath: /kaniko/.docker/
        env:
        - name: AWS_REGION
          value: us-east-2
        - name: DOCKER_CONFIG
          value: /kaniko/.docker
      restartPolicy: Never
      volumes:
      - name: aws-secret
        secret:
          secretName: aws-secret
      - name: docker-config
        configMap:
          name: docker-config
  backoffLimit: 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;kubectl create -f job.yaml&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You can tail pod logs and see the image getting created and published:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INFO[0000] Downloading base image ruby:2.4.5-jessie     
INFO[0002] Unpacking rootfs as cmd RUN mkdir -p /u01/app/ &amp;amp;&amp;amp; mkdir -p /u01/data/ &amp;amp;&amp;amp; mkdir -p /u01/logs/ &amp;amp;&amp;amp; groupadd myuser &amp;amp;&amp;amp; groupadd builds &amp;amp;&amp;amp; useradd -m -b /home -g myuser -G builds myuser &amp;amp;&amp;amp; chown -R myuser:myuser /u01/ &amp;amp;&amp;amp; chgrp -hR builds /usr/local requires it. 
INFO[0020] Taking snapshot of full filesystem...        
INFO[0033] Skipping paths under /kaniko, as it is a whitelisted directory 
INFO[0033] Skipping paths under /root/.aws, as it is a whitelisted directory 
INFO[0033] Skipping paths under /var/run, as it is a whitelisted directory 
INFO[0033] Skipping paths under /dev, as it is a whitelisted directory 
INFO[0033] Skipping paths under /proc, as it is a whitelisted directory 
INFO[0033] Skipping paths under /sys, as it is a whitelisted directory 
INFO[0038] ENV APP_HOME=/u01/app                        
INFO[0038] WORKDIR /u01/app                             
INFO[0038] cmd: workdir                                 
INFO[0038] Changed working directory to /u01/app        
INFO[0038] Creating directory /u01/app                  
INFO[0038] Taking snapshot of files...                  
INFO[0038] EXPOSE 8080                                  
INFO[0038] cmd: EXPOSE                                  
INFO[0038] Adding exposed port: 8080/tcp                
INFO[0038] RUN mkdir -p /u01/app/ &amp;amp;&amp;amp; mkdir -p /u01/data/ &amp;amp;&amp;amp; mkdir -p /u01/logs/ &amp;amp;&amp;amp; groupadd myuser &amp;amp;&amp;amp; groupadd builds &amp;amp;&amp;amp; useradd -m -b /home -g myuser -G builds myuser &amp;amp;&amp;amp; chown -R myuser:myuser /u01/ &amp;amp;&amp;amp; chgrp -hR builds /usr/local 
INFO[0038] cmd: /bin/sh                                 
INFO[0038] args: [-c mkdir -p /u01/app/ &amp;amp;&amp;amp; mkdir -p /u01/data/ &amp;amp;&amp;amp; mkdir -p /u01/logs/ &amp;amp;&amp;amp; groupadd myuser &amp;amp;&amp;amp; groupadd builds &amp;amp;&amp;amp; useradd -m -b /home -g myuser -G builds myuser &amp;amp;&amp;amp; chown -R myuser:myuser /u01/ &amp;amp;&amp;amp; chgrp -hR builds /usr/local] 
INFO[0039] Taking snapshot of full filesystem...        
INFO[0050] Skipping paths under /kaniko, as it is a whitelisted directory 
INFO[0050] Skipping paths under /root/.aws, as it is a whitelisted directory 
INFO[0050] Skipping paths under /var/run, as it is a whitelisted directory 
INFO[0051] Skipping paths under /dev, as it is a whitelisted directory 
INFO[0051] Skipping paths under /proc, as it is a whitelisted directory 
INFO[0051] Skipping paths under /sys, as it is a whitelisted directory 
INFO[0056] Using files from context: [/kaniko/buildcontext/appcontent] 
INFO[0056] ADD appcontent/ /u01/app                     
INFO[0056] Taking snapshot of files...                  
INFO[0056] USER myuser                                   
INFO[0056] cmd: USER                                    
2019/05/12 03:56:32 existing blob: sha256:053381643ee38d023c962f8789bb89be21aca864723989f7f69add5f56bd0472
2019/05/12 03:56:32 existing blob: sha256:e0ac5d162547af1273e1dc1293be3820a8c5b3f8e720d0d1d2edc969456f41aa
2019/05/12 03:56:32 existing blob: sha256:09e4a5c080c5192d682a688c786bffc1702566a0e5127262966fdb3f8c64ef45
2019/05/12 03:56:32 existing blob: sha256:14af2901e14150c042e83f3a47375b29b39f7bc31d8c49ad8d4fa582f4eb0627
2019/05/12 03:56:32 existing blob: sha256:6cc848917b0a4c37d6f00a2db476e407c6b36ce371a07e421e1b3b943ed64cba
2019/05/12 03:56:32 existing blob: sha256:62fe5b9a5ae4df86ade5163499bec6552c354611960eabfc7f1391f9e9f57945
2019/05/12 03:56:32 existing blob: sha256:bf295113f40dde5826c75de78b0aaa190302b3b467a3d6a3f222498b0ad1cea3
2019/05/12 03:56:33 pushed blob sha256:7baebbfb1ec4f9ab9d5998eefc78ebdfc063b9547df4395049c5f8a2a359ee20
2019/05/12 03:56:33 pushed blob sha256:6850b912246a34581de92e13238ac41c3389c136d25155d6cbe1c706baf3bc0e
2019/05/12 03:56:33 pushed blob sha256:0f9697e63b4482512d41f803b518ba3fb97bde20b21bec23c14bccc15f89e9f7
2019/05/12 03:56:42 pushed blob sha256:90e84d259def7af68e106b314e669056cb029d7a5d754d85cf0388419a5f2bcd
2019/05/12 03:56:43 index.docker.io/ankitbansal/httpserver:1.0: digest: sha256:13c218bb98623701eb6fd982b49bc3f90791695ce4306530c75b2094a8fdd468 size: 1896
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once done, you should be able to verify the image in the registry.&lt;/p&gt;
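
&lt;p&gt;For example, with the image pushed to Docker Hub as in the log above, pulling the tag back confirms it is available:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;docker pull ankitbansal/httpserver:1.0&lt;/p&gt;
&lt;/blockquote&gt;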

&lt;p&gt;&lt;strong&gt;Using image to deploy app&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That’s it. Now you should be able to use this image to create a deployment and verify that it works fine.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: apps/v1
kind: Deployment
metadata:
  name: rubyapp-deployment
spec:
  replicas: 1 # tells deployment to run 1 pod matching the template
  selector:
    matchLabels:
      app: rubyapp1
  template: # create pods using pod definition in this template
    metadata:
      # unlike pod-nginx.yaml, the name is not included in the meta data as a unique name is
      # generated from the deployment name
      labels:
        app: rubyapp1
    spec:
      containers:
      - name: httpserverruby
        image: ankitbansal/httpserver:1.0
        imagePullPolicy: Always
        command: [ruby]
        args: ["httpserver.rb"]
        ports:
        - containerPort: 8080
        resources:
          limits:
            memory: "1Gi"
          requests:
            memory: "1Gi"
        env:
        - name: APP_HOME
          value: "/u01/app"
        - name: PORT
          value: "8080"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and verify the response using curl.&lt;/p&gt;
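
&lt;p&gt;For a quick check without creating a Service, you can port-forward the deployment from the manifest above and curl it locally:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl port-forward deployment/rubyapp-deployment 8080:8080 &amp;amp;
curl http://localhost:8080/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;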

&lt;p&gt;&lt;strong&gt;Why not simply use the kubernetes docker daemon&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Since kubernetes already has a docker daemon running on each of its nodes, a natural question is why not use that same daemon to build the image. This is not a good option, because it requires the build container to be privileged. A privileged container has all the capabilities of the host and is no longer constrained by cgroups; essentially, it can do whatever the host can do. This can pose a serious security threat.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>docker</category>
      <category>kaniko</category>
    </item>
    <item>
      <title>Multi-tenancy in kubernetes cluster</title>
      <dc:creator>Ankit Bansal</dc:creator>
      <pubDate>Fri, 23 Nov 2018 06:15:25 +0000</pubDate>
      <link>https://forem.com/ankitbansal/multi-tenancy-in-kubernetes-cluster-44me</link>
      <guid>https://forem.com/ankitbansal/multi-tenancy-in-kubernetes-cluster-44me</guid>
      <description>&lt;p&gt;The adoption of containerization and kubernetes is increasing rapidly. Fast provisioning, lightweight, auto scaling, serverless architecture are some of the major benefits of running your application in kubernetes cluster. However, they also bring up new problems to solve. One of the issue is since resources are shared, how to ensure fair utilization of resources and avoid one compromised tenant impacting other. The other and far more concerning issue is security and how to achieve secure isolation of resources between tenants.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is Multi-tenancy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Multi-tenancy is an architecture paradigm in which a single instance of an application supports multiple customers. Each customer is called a tenant of the application and can be an individual or an organization. Multi-tenancy is a very compelling value proposition when your system supports a large number of customers, as it avoids maintaining a separate system for each. To prevent one tenant from impacting another, well-defined resource isolation is provided for each tenant. In a multi-tenant cluster, applications from different tenants are deployed side by side, and it is the responsibility of the provider to ensure tenants are isolated from each other.&lt;/p&gt;

&lt;p&gt;We faced this issue while migrating our platform to kubernetes. A kubernetes cluster consists of various layers of resources, e.g. node, namespace, pod and container, and thus isolation can be achieved at multiple levels. The default isolation suggested for kubernetes is to separate each tenant into a different namespace. However, this brings various important security considerations. Being a multi-tenant platform, it was critical for our business to provide security and isolation. After much consideration, we decided to go for node level isolation. In this article, I will discuss namespace level isolation and node level isolation with their pros and cons.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Namespace based isolation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most kubernetes resources live in a namespace. If you don’t specify a namespace for your resource, it goes into the default namespace. A namespace is a logical entity used to represent and manage cluster resources; you can think of it as a virtual cluster within the cluster itself. Namespace based isolation is one approach that can be used to achieve multi-tenancy. The idea is to have each tenant run in a different namespace.&lt;/p&gt;

&lt;p&gt;You can create any number of namespaces within your cluster. A namespace can be created like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "apiVersion": "v1",
  "kind": "Namespace",
  "metadata": {
    "name": "tenant1",
    "labels": {
      "name": "tenant1"
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;kubectl create -f namespace.json&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;
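
&lt;p&gt;Next, scope access with RBAC. The role binding further below refers to a service account named tenant1-account (the name is illustrative); a minimal definition looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: ServiceAccount
metadata:
  name: tenant1-account
  namespace: tenant1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;kubectl create -f serviceaccount.yaml&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You can then define a role limiting what that account may do within the namespace:&lt;br&gt;
&lt;/p&gt;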

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: tenant1-role
  namespace: tenant1
rules:
  - apiGroups: [""]
    resources: ["pods", "secrets"]
    verbs: ["get", "list", "watch"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;kubectl create -f role.yaml&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: tenant1-role-binding
  namespace: tenant1
subjects:
  - kind: ServiceAccount
    name: tenant1-account
    namespace: tenant1
roleRef:
  kind: Role
  name: tenant1-role
  apiGroup: rbac.authorization.k8s.io
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;kubectl create -f rolebinding.yaml&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Next, you need to create a network policy to ensure cross-namespace communication is blocked. By default, traffic is allowed between all pods within the cluster; however, you can define a policy to block all traffic and then explicitly enable the communication you need. Note that support for these policies depends on the network plugin used by the cluster provider.&lt;/p&gt;

&lt;p&gt;This policy blocks all ingress traffic to the pods in the namespace it is created in.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
  - Ingress
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;kubectl create -f defaultpolicy.yaml&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Then you can explicitly specify another network policy to allow traffic within the namespace. Note network policies are additive in nature.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: within-namespace
  namespace: tenant1
spec:
  podSelector: {}
  ingress:
    - from:
      - namespaceSelector:
          matchLabels:
            name: tenant1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;kubectl create -f namespacepolicy.yaml&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You can specify egress policies in the same manner.&lt;/p&gt;
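
&lt;p&gt;For example, the following policy (the name is illustrative) restricts egress from all pods in tenant1 to peers in the same namespace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: egress-within-namespace
  namespace: tenant1
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
    - to:
      - namespaceSelector:
          matchLabels:
            name: tenant1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;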

&lt;p&gt;By doing all this, we ensure that resources are separated out per tenant. However, there are still some issues. If one tenant creates too many resources, it can slow down other tenants or leave them without enough compute. Kubernetes supports specifying limit ranges and resource quotas that can be applied per container or across a namespace. For example, this resource quota ensures that total usage within the namespace cannot go beyond the specified limits.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: ResourceQuota
metadata:
  name: mem-cpu-demo
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;kubectl apply -f resourcequota.yaml --namespace=tenant1&lt;/p&gt;
&lt;/blockquote&gt;
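
&lt;p&gt;A LimitRange complements the quota by constraining individual containers, e.g. applying defaults when a container specifies no requests or limits (the values here are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: LimitRange
metadata:
  name: mem-limit-range
spec:
  limits:
  - default:
      memory: 512Mi
    defaultRequest:
      memory: 256Mi
    type: Container
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;kubectl apply -f limitrange.yaml --namespace=tenant1&lt;/p&gt;
&lt;/blockquote&gt;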

&lt;p&gt;&lt;strong&gt;Is this enough?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The question arises whether all of this is enough and can be relied on. The fact is that the kernel is still shared, and vulnerabilities can allow a user to gain access to the node or to other containers within the node. You can look at some of the recent &lt;a href="https://www.cvedetails.com/vulnerability-list/vendor_id-13534/product_id-28125/Docker-Docker.html"&gt;vulnerabilities&lt;/a&gt; in docker. This can cause significant impact to the business and should definitely be avoided in a multi-tenant platform. There are &lt;a href="https://kubernetes.io/docs/concepts/policy/pod-security-policy/"&gt;pod security policies&lt;/a&gt; that can achieve some of these isolations and protections; however, they don’t seem enough to counter this. If you plan to run your own code in the cluster rather than customer code, or if you plan to support internal systems or multiple teams within a customer org, you can choose this strategy. That wasn’t the case for us, so we decided to go for node level isolation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Node based isolation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not all resources in kubernetes are bound within a namespace. Low-level resources such as nodes and persistent volumes are shared across namespaces. The idea is to ensure that pods of different tenants are scheduled on different nodes, so that a node's kernel is shared only by containers of the same tenant and volume mounts and the host kernel are no longer at risk from other tenants. In this case, we label the node with the tenant info using the command below.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;kubectl label nodes worker1 tenant=tenant1&lt;/p&gt;
&lt;/blockquote&gt;
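
&lt;p&gt;You can confirm the label took effect by listing nodes with a label selector:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;kubectl get nodes -l tenant=tenant1&lt;/p&gt;
&lt;/blockquote&gt;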

&lt;p&gt;Now you can specify that a pod should be scheduled only to nodes with this label.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: some-container
    image: custom-image
    imagePullPolicy: Always
  nodeSelector:
    tenant: tenant1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This way you can ensure that your application gets scheduled on the desired nodes. You will, however, need to build additional infrastructure to keep watch on the nodes in the cluster and automatically identify when new nodes need to be pooled in. This solution does achieve multi-tenancy, but it can lead to lower utilization of resources. We found this approach to suit our purpose, given the lack of a proper solution for this in kubernetes today. One issue remains in spite of all this: master nodes are still shared. In our case we did not have much option there, since we were using a hosted cluster, but I think kubernetes should provide more isolation in this regard going forward. Still, this served our purpose well.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why not have cluster per tenant&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Cluster per tenant is not really an option comparable to these two. However, there are instances where you have only a few tenants or a few teams to manage, and creating a separate cluster for each of them is feasible. If this strategy works for you, by all means go for it; it will avoid a lot of the headaches of the previously described scenarios.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The road ahead&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;As can be seen, both approaches have their pros and cons. Though namespace isolation provides better utilization of resources, it can lead to potential security issues. Node isolation, on the other hand, leads to inefficient resource utilization. There seems to be a gap between multi-tenancy requirements and the solutions we have today, and various other projects are ramping up to fill it. A couple of solutions that look promising are:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;gVisor&lt;/strong&gt; : &lt;a href="https://gvisor.dev/"&gt;gVisor&lt;/a&gt; is a user-space kernel built by Google and used in many Google projects. It implements a substantial portion of the Linux system interface, including file systems, pipes, signals etc. Its runsc runtime enforces an isolation boundary between the application and the host kernel. Thus gVisor provides a lightweight alternative to virtual machines while maintaining clear separation of resources. There is ongoing effort to make this work with kubernetes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kata Containers&lt;/strong&gt; : Another interesting concept under development is &lt;a href="https://katacontainers.io/"&gt;kata containers&lt;/a&gt;. Kata containers are lightweight virtual machines that run on a dedicated kernel, providing isolation of network, memory, I/O etc. They are built on Open Container Initiative (OCI) standards and thus can be directly supported by kubernetes. However, this project is still in early development.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>containers</category>
      <category>docker</category>
      <category>multitenancy</category>
    </item>
  </channel>
</rss>
