<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Amar Chand</title>
    <description>The latest articles on Forem by Amar Chand (@amarchand).</description>
    <link>https://forem.com/amarchand</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F857704%2Feefec7a4-0840-4f55-93da-6f672a1df2b6.png</url>
      <title>Forem: Amar Chand</title>
      <link>https://forem.com/amarchand</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/amarchand"/>
    <language>en</language>
    <item>
      <title>Sharding the Clusters across Argo CD Application Controller Replicas</title>
      <dc:creator>Amar Chand</dc:creator>
      <pubDate>Wed, 04 Oct 2023 10:26:56 +0000</pubDate>
      <link>https://forem.com/infracloud/sharding-the-clusters-across-argo-cd-application-controller-replicas-1dgh</link>
      <guid>https://forem.com/infracloud/sharding-the-clusters-across-argo-cd-application-controller-replicas-1dgh</guid>
      <description>&lt;p&gt;Argo CD is an open-source GitOps continuous delivery tool, which helps to automate the deployment of applications to Kubernetes clusters. With growing GitOps and Kubernetes adoption, Argo CD has emerged as one of the most popular choices in the GitOps ecosystem. This is one of the blog posts, where we dwell into different Argo CD related issues that we observed as part of our Argo CD enterprise support offering to our various customers.&lt;/p&gt;

&lt;p&gt;In this blog post, we will be diving deep into a specific problem that may occur with your Argo CD setup in case you’re using it to manage multiple clusters. But before we jump into the specific problem statement, let's quickly examine what Argo CD comprises internally.&lt;/p&gt;

&lt;h2&gt;
  
  
  Components of Argo CD
&lt;/h2&gt;

&lt;p&gt;Argo CD comprises various components and each one has its own set of actions. You can see how the typical Argo CD’s architecture looks in the following diagram:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7YopC3Wd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/cszeq7lxkodl7yjywd1b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7YopC3Wd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/cszeq7lxkodl7yjywd1b.png" alt="Argo CD Architecture" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(Image Source: &lt;a href="https://argo-cd.readthedocs.io/en/stable/#architecture"&gt;Argo CD Architecture&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;There are primarily three components of Argo CD, as visible in the above diagram:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;API&lt;/li&gt;
&lt;li&gt;Repository Service (also known as &lt;code&gt;Repo Server&lt;/code&gt;) &lt;/li&gt;
&lt;li&gt;Application Controller&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Out of the above three, the components of interest for this particular blog post are &lt;code&gt;Repository Service&lt;/code&gt; and &lt;code&gt;Application Controller&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Repository Service (aka Repo Server)
&lt;/h3&gt;

&lt;p&gt;The Repo server maintains the connection to the Git repositories where application manifests are stored. It listens to changes in the Git repositories and caches the latest changes. It is also responsible for generating Kubernetes manifests from the given application specification.&lt;/p&gt;

&lt;h3&gt;
  
  
  Application Controller
&lt;/h3&gt;

&lt;p&gt;Application Controller compares the live state (what is running in the cluster) and desired/target state (what is in the repo). If there is any difference between the live and desired state, it can optionally synchronize the live state to the desired/target state, which involves deploying, updating, or removing resources as necessary.&lt;/p&gt;

&lt;p&gt;There are many cases when you might want to consider scaling the Argo CD Application Controller, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High number of applications and resources&lt;/li&gt;
&lt;li&gt;Complex application dependencies&lt;/li&gt;
&lt;li&gt;Frequent updates&lt;/li&gt;
&lt;li&gt;Large cluster size&lt;/li&gt;
&lt;li&gt;Network latency, or connectivity issues between the Argo CD Application Controller and managed clusters&lt;/li&gt;
&lt;li&gt;High availability requirements&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When an Argo CD Application Controller statefulset is scaled, from a user’s perspective it is expected that the Kubernetes clusters will be sharded across the replicas of the Application Controller uniformly. However, that is not the case. Let’s look into this in detail in the next section of the post.&lt;/p&gt;

&lt;h2&gt;
  
  
  Problem Statement
&lt;/h2&gt;

&lt;p&gt;We once had a situation where one of our customers ran into the problem of non-uniform sharding in the Argo CD Application Controller. The customer reported slow synchronization despite having multiple replicas of the Argo CD Application Controller running for their clusters. &lt;/p&gt;

&lt;p&gt;At first, it seemed that the Application Controller was handling too many clusters and was using too many resources. Hence, the customer went ahead and scaled up the Argo CD Application Controller statefulset. By doing so, it was expected that each replica of the Application Controller would focus on a subset of clusters, thus distributing the workload and memory usage. This process is known as sharding. Even &lt;a href="https://argo-cd.readthedocs.io/en/stable/operator-manual/high_availability/#argocd-application-controller"&gt;Argo CD’s official documentation suggests leveraging sharding&lt;/a&gt;. However, the sharding mechanism of the Argo CD Application Controller does not provide much help here.&lt;/p&gt;

&lt;p&gt;When our Argo CD support engineers started looking deep into the problem, they found that some of the Argo CD Application Controller replicas were managing more clusters in comparison to other replicas and a couple of replicas were managing no clusters at all – which implies that increasing the number of replicas does not necessarily mean that your clusters will be sharded uniformly across the available replicas.&lt;/p&gt;

&lt;p&gt;To find how the clusters are sharded, you can use the &lt;code&gt;argocd&lt;/code&gt; command line utility. If it is not available, you can install it by following the &lt;a href="https://argo-cd.readthedocs.io/en/stable/cli_installation/"&gt;Argo CD CLI installation steps&lt;/a&gt;. Once installed and connected to the Argo CD server, you can run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;argocd admin cluster stats
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command will show the shard allocated to each of the clusters managed by the connected Argo CD instance. Following is a snippet of the output of the above command:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The snippet below is not complete; its only purpose is to show how to interpret the output of the &lt;code&gt;argocd admin cluster stats&lt;/code&gt; command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SERVER                          SHARD  CONNECTION  NAMESPACES COUNT  APPS COUNT  RESOURCES COUNT
https://kubernetes.default.svc  0                  4                 65          217
&amp;lt;redacted&amp;gt;                      4                  4                 65          217
&amp;lt;redacted&amp;gt;                      4                  5                 73          228
&amp;lt;redacted&amp;gt;                      3                  4                 65          217
&amp;lt;redacted&amp;gt;                      0                  4                 65          217
&amp;lt;redacted&amp;gt;                      1                  5                 73          228
&amp;lt;redacted&amp;gt;                      3                  4                 65          217
&amp;lt;redacted&amp;gt;                      4                  4                 65          217
&amp;lt;redacted&amp;gt;                      4                  5                 73          228
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the above snippet, the first column contains the server address of the particular Kubernetes cluster, and the second column contains the index of the Argo CD Application Controller replica that is in charge of maintaining the live state of the respective cluster. For example, the first cluster, with server address &lt;code&gt;https://kubernetes.default.svc&lt;/code&gt;, is being maintained by the Argo CD Application Controller replica with index &lt;code&gt;0&lt;/code&gt;, or in other words, by &lt;code&gt;argocd-application-controller-0&lt;/code&gt;. Please note that all the replicas of the Argo CD Application Controller have the index number as a suffix. So shard &lt;code&gt;0&lt;/code&gt; means &lt;code&gt;argocd-application-controller-0&lt;/code&gt;, shard &lt;code&gt;1&lt;/code&gt; means &lt;code&gt;argocd-application-controller-1&lt;/code&gt;, and so on.&lt;/p&gt;

&lt;p&gt;If you look at the above snippet, you can see that four of the clusters are being handled by the &lt;code&gt;argocd-application-controller-4&lt;/code&gt; pod. &lt;code&gt;argocd-application-controller-0&lt;/code&gt; and &lt;code&gt;argocd-application-controller-3&lt;/code&gt; handle two clusters each, and &lt;code&gt;argocd-application-controller-1&lt;/code&gt; handles only one cluster.&lt;/p&gt;

&lt;h2&gt;
  
  
  Troubleshooting
&lt;/h2&gt;

&lt;p&gt;As the first step of troubleshooting, our professional support engineers decided to analyze the Argo CD Application Controller’s logs. When checking the logs for further troubleshooting, they found the following log multiple times in all the replicas of the Argo CD Application Controller:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The timestamp differs from one log line to the next; since the message was the same, we have not added more log lines here.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;time="2023-07-21T11:27:12Z" level=info msg="Ignoring cluster &amp;lt;cluster-server-address&amp;gt;"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When our team dug deeper into the issue looking for the sharding logic, they found that the sharding function has been written in such a way that it assigns the particular replica of the Argo CD Application Controller to manage a cluster, based on the &lt;a href="https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#uids"&gt;UUID&lt;/a&gt; of the secret storing the cluster (considering we are not manually interfering with the sharding process). &lt;/p&gt;

&lt;h3&gt;
  
  
  Logic behind Sharding in Argo CD Application Controller
&lt;/h3&gt;

&lt;p&gt;The following flow diagram depicts how the sharding logic works internally in the Argo CD codebase.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The diagram depicts the sharding logic of Argo CD version &amp;lt; 2.8.0. With the release of Argo CD 2.8.0, this sharding logic is now known as the &lt;code&gt;legacy&lt;/code&gt; sharding algorithm.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xig4Iva5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yc4hoo6n6ei52lhg0usq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xig4Iva5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yc4hoo6n6ei52lhg0usq.png" alt="Argo CD Application Controller Sharding Logic" width="800" height="720"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(Argo CD Application Controller Sharding Logic)&lt;/p&gt;

&lt;h2&gt;
  
  
  Solution for uniform cluster sharding across Argo CD Application Controller replicas
&lt;/h2&gt;

&lt;p&gt;In the present day, there are two ways to handle such a scenario:&lt;/p&gt;

&lt;p&gt;A. Using round-robin algorithm&lt;br&gt;&lt;br&gt;
B. Manually defining the shard&lt;/p&gt;

&lt;p&gt;In our case, our team went ahead with Solution B, as that was the only solution present when the issue occurred. However, with the release of &lt;a href="https://github.com/argoproj/argo-cd/releases/tag/v2.8.0"&gt;Argo CD 2.8.0&lt;/a&gt; (released on August 7, 2023), things have changed - for the better :). Now, there are two ways to handle the sharding issue with the Argo CD Application Controller:&lt;/p&gt;
&lt;h3&gt;
  
  
  Solution A: Use the Round-Robin sharding algorithm (available only for Argo CD 2.8.0 and later releases)
&lt;/h3&gt;

&lt;p&gt;An issue was raised on GitHub for the &lt;a href="https://github.com/argoproj/argo-cd/issues/9633"&gt;sharding algorithm of Argo CD Application Controller&lt;/a&gt; and that issue has been fixed in &lt;a href="https://github.com/argoproj/argo-cd/tree/v2.8.0"&gt;Argo CD 2.8.0&lt;/a&gt; by &lt;a href="https://github.com/argoproj/argo-cd/pull/13018"&gt;pull request 13018&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This means that users can upgrade to 2.8.0 or any later version and configure the sharding algorithm to get rid of this issue. If you don't want to (or can't) upgrade to 2.8.0, you might want to go for Solution B. &lt;/p&gt;

&lt;p&gt;However, it has to be noted that the new round-robin sharding algorithm is not the default sharding algorithm for the Argo CD Application Controller at the time of writing this blog post. The legacy sharding algorithm is still the default.&lt;/p&gt;
&lt;h4&gt;
  
  
  How to configure the Argo CD Application Controller to use a round-robin sharding algorithm?
&lt;/h4&gt;

&lt;p&gt;For configuring the sharding algorithm in Argo CD 2.8.0 or later, we need to set &lt;code&gt;controller.sharding.algorithm&lt;/code&gt; to &lt;code&gt;round-robin&lt;/code&gt; in &lt;code&gt;argocd-cmd-params-cm&lt;/code&gt; configmap. If you have installed Argo CD using manifest files, connect to the cluster on which Argo CD is running, update the namespace in the following command, and run the same:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl patch configmap argocd-cmd-params-cm &lt;span class="nt"&gt;-n&lt;/span&gt; &amp;lt;argocd-namespace&amp;gt; &lt;span class="nt"&gt;--type&lt;/span&gt; merge &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s1"&gt;'{"data":{"controller.sharding.algorithm":"round-robin"}}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After updating the configmap successfully, restart the Argo CD Application Controller statefulset with a rollout restart using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl rollout restart &lt;span class="nt"&gt;-n&lt;/span&gt; &amp;lt;argocd-namespace&amp;gt; statefulset argocd-application-controller
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, to verify that the Argo CD Application Controller is using a &lt;code&gt;round-robin&lt;/code&gt; sharding algorithm, run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; argocd-application-controller-0 &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="nb"&gt;env&lt;/span&gt; | &lt;span class="nb"&gt;grep &lt;/span&gt;ARGOCD_CONTROLLER_SHARDING_ALGORITHM
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The expected output should be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;ARGOCD_CONTROLLER_SHARDING_ALGORITHM&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;round-robin&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In case you maintain Argo CD using Helm, you can add the &lt;code&gt;controller.sharding.algorithm: "round-robin"&lt;/code&gt; key-value pair under &lt;code&gt;.config.params&lt;/code&gt; in the values file and install/upgrade the setup to get the same result.&lt;/p&gt;

&lt;p&gt;In case you maintain Argo CD using the Argo CD Operator, you can add the &lt;code&gt;ARGOCD_CONTROLLER_SHARDING_ALGORITHM&lt;/code&gt; environment variable under &lt;a href="https://argocd-operator.readthedocs.io/en/latest/reference/argocd/#controller-options"&gt;controller&lt;/a&gt; in the &lt;a href="https://argocd-operator.readthedocs.io/en/latest/reference/argocd/#controller-options"&gt;ArgoCD resource&lt;/a&gt; specification and set its value to &lt;code&gt;'round-robin'&lt;/code&gt;. Make sure you have enabled sharding for the controller using the &lt;code&gt;Sharding.enabled&lt;/code&gt; flag under &lt;a href="https://argocd-operator.readthedocs.io/en/latest/reference/argocd/#controller-options"&gt;controller&lt;/a&gt;. Apply the configuration once the changes are done.&lt;/p&gt;

&lt;h3&gt;
  
  
  Solution B: Manually define the shard
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;This is a workaround in case the user doesn’t want to upgrade the running Argo CD instance or wants to manage the sharding manually.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Define the shard for a new cluster
&lt;/h4&gt;

&lt;p&gt;If you are adding a new cluster, mention the index of the &lt;code&gt;application-controller&lt;/code&gt; replica which you require to manage the cluster, against the &lt;code&gt;shard&lt;/code&gt; key, while defining the particular cluster secret. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Secret&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;&amp;lt;secret-name&amp;gt;&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;argocd.argoproj.io/secret-type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cluster&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;&amp;lt;secret-namespace&amp;gt;&lt;/span&gt;
&lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Opaque&lt;/span&gt;
&lt;span class="na"&gt;stringData&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;&amp;lt;cluster-name&amp;gt;&lt;/span&gt;
  &lt;span class="na"&gt;server&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;&amp;lt;server-url&amp;gt;&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;&amp;lt;configuration&amp;gt;&lt;/span&gt;
  &lt;span class="na"&gt;shard&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;desired-application-controller-replica-index-here&amp;gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The value of &lt;code&gt;shard&lt;/code&gt; goes at the &lt;code&gt;.stringData.shard&lt;/code&gt; location while entering the data. When you check the secret again, you will find the base64 encoded value of the &lt;code&gt;shard&lt;/code&gt; key at &lt;code&gt;.data.shard&lt;/code&gt; in the secret. Please note that the value of &lt;code&gt;shard&lt;/code&gt; should be in &lt;code&gt;string&lt;/code&gt; format, not in &lt;code&gt;int&lt;/code&gt; format; use quotes for that.&lt;/p&gt;

&lt;p&gt;If you want to add the cluster imperatively, mention the index of the &lt;code&gt;application-controller&lt;/code&gt; replica which you require to manage the cluster, against the &lt;code&gt;--shard&lt;/code&gt; argument. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;argocd cluster add &amp;lt; context-here &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--shard&lt;/span&gt; &amp;lt;desired-application-controller-replica-index-here&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Please note that you need to enter an &lt;code&gt;int&lt;/code&gt; value if you are adding the cluster imperatively.&lt;/p&gt;

&lt;h4&gt;
  
  
  Update the shard for an existing cluster
&lt;/h4&gt;

&lt;p&gt;In case you have an existing cluster for which you manually want to define the shard, then you will need to edit the particular cluster secret and add the following block:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;stringData&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;shard&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;desired-application-controller-replica-index-here&amp;gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The value of the shard goes at the &lt;code&gt;.stringData.shard&lt;/code&gt; location while entering the data. When you check the secret again, you will find the base64 encoded value of the &lt;code&gt;shard&lt;/code&gt; key at &lt;code&gt;.data.shard&lt;/code&gt; in the secret. Please note that the value of &lt;code&gt;shard&lt;/code&gt; should be in &lt;code&gt;string&lt;/code&gt; format, not in &lt;code&gt;int&lt;/code&gt; format; use quotes for that.&lt;/p&gt;

&lt;p&gt;Once the shards had been defined, the clusters were distributed evenly across the Application Controller replicas and managed efficiently, as the graph below shows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--V0qGxLHM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5eb0vmys3ge0qwkq5i0m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--V0qGxLHM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5eb0vmys3ge0qwkq5i0m.png" alt="Argo CD Cluster Distribution" width="800" height="541"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(Argo CD Cluster Distribution)&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;So, we learned how to use different approaches to deal with the improper sharding mechanism of the Argo CD Application Controller. Though using the built-in solution (round-robin sharding algorithm) makes more sense generally, there are cases when you might want to leverage manual sharding. For example, if you have three clusters, where the first two clusters are running 400 applications each, but the third cluster is running 800 applications, it makes sense to share one shard between the first two clusters and dedicate one shard to the third cluster.&lt;/p&gt;

&lt;p&gt;At the time of writing, it is being discussed that the round-robin sharding algorithm in Argo CD 2.8.0 still has some problems with logging (it generates too many logs). However, the change seems to be a step in the right direction, and the &lt;a href="https://github.com/argoproj/argo-cd/issues/14337"&gt;issue&lt;/a&gt; is being worked upon right now. It should be fixed soon.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Scaling the Application Controller should be done judiciously and should be aligned with the actual needs of your environment. Monitoring the performance and resource utilization of the Application Controller can help you make informed decisions about when and how to scale.&lt;/p&gt;

&lt;p&gt;We at InfraCloud also help our customers with this kind of assessment and implement Argo CD to cater to their requirements well. If you are looking for help with GitOps adoption using Argo CD, do check our &lt;a href="https://www.infracloud.io/argo-cd-consulting-support/"&gt;Argo CD consulting capabilities and expertise&lt;/a&gt; to know how we can help with your GitOps adoption journey. If you’re looking for managed on-demand Argo CD support, check our &lt;a href="https://www.infracloud.io/argo-cd-support/"&gt;Argo CD support model&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I hope you found this post informative. For more posts like this one, subscribe to our weekly newsletter. I’d love to hear your thoughts on this post, so do start a conversation on &lt;a href="https://www.linkedin.com/in/amardargad/"&gt;LinkedIn&lt;/a&gt; :)&lt;/p&gt;

&lt;h2&gt;
  
  
  Additional Resources
&lt;/h2&gt;

&lt;h3&gt;
  
  
  References
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/argoproj/argo-cd/issues/9633"&gt;GitHub issue&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/argoproj/argo-cd/pull/13018"&gt;Pull Request&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/argoproj/argo-cd/releases/tag/v2.8.0"&gt;Release Notes of Argo CD 2.8.0&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Community Support
&lt;/h3&gt;

&lt;p&gt;If you want to connect to the Argo CD community, please join &lt;a href="https://slack.cncf.io"&gt;CNCF Slack&lt;/a&gt;. You can join &lt;code&gt;#argo-cd&lt;/code&gt; and many other channels too.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>shardingclusters</category>
      <category>argocdapplicationcontroller</category>
      <category>argocd</category>
    </item>
    <item>
      <title>Getting Started with ArgoCD</title>
      <dc:creator>Amar Chand</dc:creator>
      <pubDate>Tue, 07 Mar 2023 12:10:16 +0000</pubDate>
      <link>https://forem.com/amarchand/getting-started-4fc8</link>
      <guid>https://forem.com/amarchand/getting-started-4fc8</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;In this blog, we shall discuss what ArgoCD is, and how to install it.&lt;/p&gt;

&lt;h2&gt;
  
  
  ArgoCD
&lt;/h2&gt;

&lt;p&gt;Quoting &lt;a href="https://argo-cd.readthedocs.io/en/stable/"&gt;ArgoCD's website&lt;/a&gt;, &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;ArgoCD is a declarative, gitops Continuous Delivery tool for Kubernetes.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But what is declarative? What is GitOps? And what is Continuous Delivery? Let's understand them one by one, in reverse order.&lt;/p&gt;

&lt;h3&gt;
  
  
  Continuous Delivery
&lt;/h3&gt;

&lt;p&gt;Continuous Delivery is a software engineering approach, in which code changes are automatically prepared for release in production.&lt;/p&gt;

&lt;p&gt;Continuous Delivery is often aided by Continuous Integration. The process is: once a developer pushes some code, Continuous Integration is leveraged to test the code. Once the code is tested and all the tests pass, Continuous Delivery tools take the baton and deploy the code into various environments (e.g. QA, Staging, etc.).&lt;/p&gt;

&lt;h3&gt;
  
  
  GitOps
&lt;/h3&gt;

&lt;p&gt;The term &lt;code&gt;GitOps&lt;/code&gt; was coined in 2017 by &lt;a href="https://www.weave.works/"&gt;Weaveworks&lt;/a&gt;. Since its inception, it has caught a lot of attention.&lt;/p&gt;

&lt;p&gt;GitOps is a way of implementing Continuous Deployment for cloud native applications.&lt;/p&gt;

&lt;p&gt;NOTE: Continuous Deployment is almost the same as Continuous Delivery; the only difference is that deploying an application into the production environment requires approval in Continuous Delivery, while Continuous Deployment deploys the code directly into production as well.&lt;/p&gt;

&lt;p&gt;The main idea behind GitOps is to have a Git repository, which contains the declarative description of the desired infrastructure for the particular environments (let's say production) and an automated process to keep the production environment in the desired state. ArgoCD is one such tool which is capable of doing this.&lt;/p&gt;

&lt;p&gt;We will understand this more when we'll deploy our first application using ArgoCD.&lt;/p&gt;

&lt;h3&gt;
  
  
  Declarative
&lt;/h3&gt;

&lt;p&gt;ArgoCD is declarative in nature. It means that whatever state we define in a manifest file will be achieved by ArgoCD. There is no need to define the procedure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installing ArgoCD
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Pre-requisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;a Kubernetes cluster (we shall be using Minikube as the primary cluster for further blogs in this series; to learn how to install Minikube, please refer to the &lt;a href="https://minikube.sigs.k8s.io/docs/start/"&gt;Minikube documentation&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;kubectl&lt;/code&gt; utility to interact with the Kubernetes cluster you have access to (to install the kubectl utility, please refer to the &lt;a href="https://kubernetes.io/docs/tasks/tools/#kubectl"&gt;Kubernetes documentation&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Installation
&lt;/h3&gt;

&lt;p&gt;There are multiple ways to install Argo CD. You can simply install it by applying plain manifests. You can also use the Helm chart to deploy it. In case you have an OCP/OKD cluster, you will most probably want to go with the Operator-based installation. We are going to discuss the manifests way and the Helm way in this blog. For Operator-based installation, please refer to the &lt;a href="https://argocd-operator.readthedocs.io/en/latest/install/start/"&gt;official Argo CD Operator documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The most beginner-friendly way to install Argo CD is to directly deploy it using plain Kubernetes manifests. Run the following commands to do that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl create namespace argocd
kubectl apply &lt;span class="nt"&gt;-n&lt;/span&gt; argocd &lt;span class="nt"&gt;-f&lt;/span&gt; https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want to install Argo CD using the Helm chart, then first add the Argo CD repository to your Helm repositories list using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;helm repo add argo https://argoproj.github.io/argo-helm
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you have already added the repository, please run the following command to fetch the information about updated versions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;helm repo update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that we have added the helm repository, run the following command to install Argo CD using helm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;helm &lt;span class="nb"&gt;install &lt;/span&gt;argocd argo/argo-cd &lt;span class="nt"&gt;--version&lt;/span&gt; 5.43.0 &lt;span class="nt"&gt;-n&lt;/span&gt; argocd &lt;span class="nt"&gt;--create-namespace&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above command installs Argo CD in the &lt;code&gt;argocd&lt;/code&gt; namespace. In the next blog post of this series, we will explore how to log in, and we will also deploy our first application via Argo CD.&lt;/p&gt;

</description>
      <category>argocd</category>
      <category>gitops</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Securing Kubernetes Cluster using Kubescape and kube-bench</title>
      <dc:creator>Amar Chand</dc:creator>
      <pubDate>Mon, 25 Jul 2022 09:47:15 +0000</pubDate>
      <link>https://forem.com/infracloud/securing-kubernetes-cluster-using-kubescape-and-kube-bench-5cck</link>
      <guid>https://forem.com/infracloud/securing-kubernetes-cluster-using-kubescape-and-kube-bench-5cck</guid>
<description>&lt;p&gt;With businesses adopting cloud native technology, Kubernetes has emerged as the primary choice for container orchestration. Deploying and managing applications has never been easier. Securing clusters, however, is still uncharted territory for many teams: attackers keep finding new ways to break into systems while the community works round the clock to close them.&lt;/p&gt;

&lt;p&gt;To improve the security of a cluster, one first needs to understand what it consists of and how it works. That calls for a detailed review of the cluster, including the file system where the configuration of each Kubernetes component is stored, a line-by-line analysis of the deployed artifacts, and so on. Organizations such as the NSA, MITRE and CIS publish benchmarks for Kubernetes and keep updating them. These benchmarks cover so many details, though, that checking everything manually becomes a very lengthy process.&lt;/p&gt;

&lt;p&gt;While exploring how to set up vulnerability assessment scans for the Kubernetes clusters, we came across two tools: kube-bench and Kubescape.&lt;/p&gt;

&lt;p&gt;In this blog post, we shall discuss open source offerings of these tools, what their capabilities are, how they work, which frameworks they use, when to use them and why, and how they complement each other. So, let's get started.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is kube-bench?
&lt;/h2&gt;

&lt;p&gt;kube-bench is an open source tool from &lt;a href="https://www.aquasec.com" rel="noopener noreferrer"&gt;Aqua Security&lt;/a&gt; that checks a cluster against the &lt;a href="https://www.cisecurity.org/cis-benchmarks/" rel="noopener noreferrer"&gt;Center for Internet Security (CIS) Kubernetes Benchmark&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  How does kube-bench work?
&lt;/h3&gt;

&lt;p&gt;kube-bench does not run continuously on your cluster. Instead, you run it on each node with a simple command. The checks are divided into sections, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Master Node Security Configuration&lt;/li&gt;
&lt;li&gt;etcd Node Configuration&lt;/li&gt;
&lt;li&gt;Control Plane Configuration&lt;/li&gt;
&lt;li&gt;Worker Node Security Configuration&lt;/li&gt;
&lt;li&gt;Kubernetes Policies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every section prints its own checks, remediations for the checks that fail or warn, and a summary (count of PASS/FAIL/WARN/INFO checks). At the end, an overall summary is printed. Following are some small snippets of the output of a kube-bench scan on a minikube cluster:&lt;/p&gt;

&lt;h4&gt;
  
  
  Checks Example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[INFO] 1 Master Node Security Configuration
[INFO] 1.1 Master Node Configuration Files
[FAIL] 1.1.1 Ensure that the API server pod specification file permissions are set to 644 or more restrictive (Automated)
[FAIL] 1.1.2 Ensure that the API server pod specification file ownership is set to root:root (Automated)
[FAIL] 1.1.3 Ensure that the controller manager pod specification file permissions are set to 644 or more restrictive (Automated)
[FAIL] 1.1.4 Ensure that the controller manager pod specification file ownership is set to root:root (Automated)
[FAIL] 1.1.5 Ensure that the scheduler pod specification file permissions are set to 644 or more restrictive (Automated)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[INFO] 1.2 API Server
[WARN] 1.2.1 Ensure that the --anonymous-auth argument is set to false (Manual)
[PASS] 1.2.2 Ensure that the --token-auth-file parameter is not set (Automated)
[PASS] 1.2.3 Ensure that the --kubelet-https argument is set to true (Automated)
[PASS] 1.2.4 Ensure that the --kubelet-client-certificate and --kubelet-client-key arguments are set as appropriate (Automated)
[FAIL] 1.2.5 Ensure that the --kubelet-certificate-authority argument is set as appropriate (Automated)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Remediations Example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1.1.1 Run the below command (based on the file location on your system) on the
master node.
For example, chmod 644 /etc/kubernetes/manifests/kube-apiserver.yaml
1.1.2 Run the below command (based on the file location on your system) on the master node.
For example,
chown root:root /etc/kubernetes/manifests/kube-apiserver.yaml
1.1.3 Run the below command (based on the file location on your system) on the master node.
For example,
chmod 644 /etc/kubernetes/manifests/kube-controller-manager.yaml
1.1.4 Run the below command (based on the file location on your system) on the master node.
For example,
chown root:root /etc/kubernetes/manifests/kube-controller-manager.yaml
1.1.5 Run the below command (based on the file location on your system) on the master node.
For example,
chmod 644 /etc/kubernetes/manifests/kube-scheduler.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1.2.1 Edit the API server pod specification file /etc/kubernetes/manifests/kube-apiserver.yaml
on the master node and set the below parameter.
--anonymous-auth=false
1.2.5 Follow the Kubernetes documentation and setup the TLS connection between
the apiserver and kubelets. Then, edit the API server pod specification file
/etc/kubernetes/manifests/kube-apiserver.yaml on the master node and set the
--kubelet-certificate-authority parameter to the path to the cert file for the certificate authority.
--kubelet-certificate-authority=&amp;lt;ca-string&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Summary Example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;24 checks PASS
27 checks FAIL
13 checks WARN
0 checks INFO
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Deployment methods
&lt;/h3&gt;

&lt;p&gt;kube-bench can be executed as a plain binary on the host, as a container on the host using Docker, or as a Job inside the Kubernetes cluster. When it runs inside a container or pod, it needs the host's PID namespace (&lt;code&gt;hostPID: true&lt;/code&gt;) so that it can read the command-line flags of the running control plane processes, plus host path mounts of the configuration directories it inspects. That makes the kube-bench pod itself a highly privileged workload, so restrict who may create it and remove the Job once you have collected the results. The methods to run kube-bench in AKS, EKS, GKE, on-prem clusters, OpenShift and ACK (Alibaba Cloud Container Service for Kubernetes) differ, but they are well &lt;a href="https://github.com/aquasecurity/kube-bench/blob/main/docs/running.md" rel="noopener noreferrer"&gt;documented&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  When to use kube-bench?
&lt;/h3&gt;

&lt;p&gt;kube-bench's analysis is great when it scans nodes (control plane node, worker node, etcd node). It gives very precise instructions regarding ownership and permissions of configuration files as well as flags and arguments that are wrongly configured, and it prints the exact command to run wherever one applies. However, we found that the output turns into general guidelines when it comes to scanning artifacts inside the cluster: it does not tell you which artifact carries the misconfiguration. Following are some examples of checks and remediations under the Kubernetes Policies section:&lt;/p&gt;

&lt;h4&gt;
  
  
  Checks
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[INFO] 5 Kubernetes Policies
[INFO] 5.1 RBAC and Service Accounts
[WARN] 5.1.1 Ensure that the cluster-admin role is only used where required (Manual)
[WARN] 5.1.2 Minimize access to secrets (Manual)
[WARN] 5.1.3 Minimize wildcard use in Roles and ClusterRoles (Manual)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[INFO] 5.2 Pod Security Policies
[WARN] 5.2.1 Minimize the admission of privileged containers (Automated)
[WARN] 5.2.2 Minimize the admission of containers wishing to share the host process ID namespace (Automated)
[WARN] 5.2.3 Minimize the admission of containers wishing to share the host IPC namespace (Automated)
[WARN] 5.2.4 Minimize the admission of containers wishing to share the host network namespace (Automated)
[WARN] 5.2.5 Minimize the admission of containers with allowPrivilegeEscalation (Automated)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Remediations
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;5.1.1 Identify all clusterrolebindings to the cluster-admin role. Check if they are used and if they need this role or if they could use a role with fewer privileges.
Where possible, first bind users to a lower privileged role and then remove the clusterrolebinding to the cluster-admin role :
kubectl delete clusterrolebinding [name]
5.1.2 Where possible, remove get, list and watch access to secret objects in the cluster.
5.1.3 Where possible replace any use of wildcards in clusterroles and roles with specific objects or actions.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;5.2.1 Create a PSP as described in the Kubernetes documentation, ensuring that the .spec.privileged field is omitted or set to false.
5.2.2 Create a PSP as described in the Kubernetes documentation, ensuring that the .spec.hostPID field is omitted or set to false.
5.2.3 Create a PSP as described in the Kubernetes documentation, ensuring that the .spec.hostIPC field is omitted or set to false.
5.2.4 Create a PSP as described in the Kubernetes documentation, ensuring that the .spec.hostNetwork field is omitted or set to false.
5.2.5 Create a PSP as described in the Kubernetes documentation, ensuring that the .spec.allowPrivilegeEscalation field is omitted or set to false.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Such output does not give a clear picture of the cluster. For instance, it does not say which ClusterRoleBindings, roles or pods violate the controls, and on a large cluster that kind of information does not help much. Note also that the 5.2.x remediations refer to PodSecurityPolicy, which was deprecated in Kubernetes 1.21 and removed in 1.25; on current clusters the equivalent enforcement point is Pod Security Admission (the &lt;code&gt;pod-security.kubernetes.io/enforce&lt;/code&gt; namespace label) or a policy admission controller.&lt;/p&gt;

&lt;h3&gt;
  
  
  Integrations with other tools
&lt;/h3&gt;

&lt;p&gt;At the time of writing this blog, kube-bench does not offer any native integration with other tools. However, AWS Security Hub has added it as an open source tool integration. Here are more &lt;a href="https://aws.amazon.com/about-aws/whats-new/2020/12/aws-security-hub-adds-open-source-tool-integration-with-kube-bench-and-cloud-custodian/" rel="noopener noreferrer"&gt;details on kube-bench integrations with other tools&lt;/a&gt;. Apart from this, kube-bench can also emit the scan results as JSON, so if you want to build reports or raise alerts from the scan results, you can script around that output.&lt;/p&gt;
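&lt;p&gt;As an illustration of that scripting approach, here is an added example (written for this page, not code from the original post or from kube-bench) that flattens the JSON report into a list of findings, recomputes the PASS/FAIL/WARN/INFO counts, emits one alert line per scored FAIL and returns a non-zero exit code suitable for gating a pipeline. It assumes the shape that &lt;code&gt;kube-bench --json&lt;/code&gt; produces (a top-level &lt;code&gt;Controls&lt;/code&gt; list whose &lt;code&gt;tests&lt;/code&gt; carry &lt;code&gt;results&lt;/code&gt; with &lt;code&gt;test_number&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;, &lt;code&gt;scored&lt;/code&gt; and &lt;code&gt;remediation&lt;/code&gt; fields); verify that against your kube-bench version. The embedded sample report is constructed by hand from the checks shown above rather than captured from a real scan.&lt;/p&gt;

```python
"""Turn a kube-bench JSON report into findings, counts, alert lines and a CI exit code.

Added example for this page; it is not code from the original post or from
kube-bench. Field names follow the shape of `kube-bench --json` (top-level
"Controls", each with "tests", each with "results"); treat that as an
assumption and check it against your own kube-bench version.
"""
import json
import sys

# Constructed sample in the kube-bench JSON shape, cut down to a few of the
# checks shown in the post. Not captured from a real scan.
SAMPLE_KUBE_BENCH_JSON = r'''
{"Controls": [{"id": "1", "text": "Master Node Security Configuration", "node_type": "master",
  "tests": [
    {"section": "1.1", "desc": "Master Node Configuration Files", "results": [
      {"test_number": "1.1.1", "status": "FAIL", "scored": true,
       "test_desc": "Ensure that the API server pod specification file permissions are set to 644 or more restrictive (Automated)",
       "remediation": "Run the below command on the master node.\nFor example, chmod 644 /etc/kubernetes/manifests/kube-apiserver.yaml\n"},
      {"test_number": "1.1.2", "status": "FAIL", "scored": true,
       "test_desc": "Ensure that the API server pod specification file ownership is set to root:root (Automated)",
       "remediation": "For example, chown root:root /etc/kubernetes/manifests/kube-apiserver.yaml\n"}]},
    {"section": "1.2", "desc": "API Server", "results": [
      {"test_number": "1.2.1", "status": "WARN", "scored": false,
       "test_desc": "Ensure that the --anonymous-auth argument is set to false (Manual)",
       "remediation": "Edit /etc/kubernetes/manifests/kube-apiserver.yaml and set --anonymous-auth=false\n"},
      {"test_number": "1.2.2", "status": "PASS", "scored": true,
       "test_desc": "Ensure that the --token-auth-file parameter is not set (Automated)", "remediation": ""},
      {"test_number": "1.2.5", "status": "FAIL", "scored": true,
       "test_desc": "Ensure that the --kubelet-certificate-authority argument is set as appropriate (Automated)",
       "remediation": "Set --kubelet-certificate-authority to the CA that signed the kubelet serving certificates.\n"}]}]}],
 "Totals": {"total_pass": 1, "total_fail": 3, "total_warn": 1, "total_info": 0}}
'''


def iter_results(report):
    """Yield (control, test, result) triples from a parsed kube-bench report."""
    for control in report.get("Controls", []):
        for test in control.get("tests", []):
            for result in test.get("results", []):
                yield control, test, result


def parse_findings(raw_json, statuses=("FAIL", "WARN")):
    """Flatten the report into one dict per check whose status is in `statuses`."""
    findings = []
    for control, test, result in iter_results(json.loads(raw_json)):
        if result.get("status") in statuses:
            findings.append({
                "id": result.get("test_number", ""),
                "node_type": control.get("node_type", ""),
                "section": test.get("section", ""),
                "status": result.get("status"),
                "scored": bool(result.get("scored", False)),
                "desc": result.get("test_desc", ""),
                "remediation": result.get("remediation", "").strip(),
            })
    return findings


def count_statuses(raw_json):
    """Recount PASS/FAIL/WARN/INFO from the results instead of trusting 'Totals'."""
    counts = {"PASS": 0, "FAIL": 0, "WARN": 0, "INFO": 0}
    for _control, _test, result in iter_results(json.loads(raw_json)):
        status = result.get("status")
        if status in counts:
            counts[status] += 1
    return counts


def alert_lines(findings, host):
    """One key=value line per scored FAIL, easy to ship to a log pipeline or SIEM."""
    lines = []
    for f in findings:
        if f["status"] == "FAIL" and f["scored"]:
            lines.append("kube-bench host=%s node_type=%s check=%s status=FAIL desc=%r"
                         % (host, f["node_type"], f["id"], f["desc"]))
    return lines


def exit_code_for(findings, fail_on_warn=False):
    """1 when a scored FAIL exists (or any WARN, if fail_on_warn), else 0."""
    for f in findings:
        if f["status"] == "FAIL" and f["scored"]:
            return 1
        if fail_on_warn and f["status"] == "WARN":
            return 1
    return 0


def main(argv):
    raw = open(argv[1]).read() if len(argv) == 2 else SAMPLE_KUBE_BENCH_JSON
    findings = parse_findings(raw)
    print("counts:", count_statuses(raw))
    for line in alert_lines(findings, host="node-1"):
        print(line)
    for f in findings:
        print("%s [%s] %s\n    remediation: %s" % (f["id"], f["status"], f["desc"], f["remediation"]))
    return exit_code_for(findings)


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

&lt;p&gt;Run it with the path to a saved report as the only argument; without an argument it uses the constructed sample. Unscored (Manual) checks are deliberately kept out of the alert and exit-code logic, because they need a human decision and would otherwise make every pipeline red; pass &lt;code&gt;fail_on_warn=True&lt;/code&gt; if you want them to block as well.&lt;/p&gt;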

&lt;p&gt;So, this was all about kube-bench. As we saw above, it is great for securing the cluster from the nodes' side. However, it does not pinpoint misconfigurations in the Kubernetes artifacts themselves. That gap is well covered by the other tool we are about to discuss, which has grown popular recently: Kubescape.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Kubescape?
&lt;/h2&gt;

&lt;p&gt;Kubescape is a tool from &lt;a href="https://www.armosec.io" rel="noopener noreferrer"&gt;ARMO Security&lt;/a&gt;. Its open source offering analyzes the cluster against frameworks derived from &lt;a href="https://media.defense.gov/2022/Mar/01/2002947139/-1/-1/0/CTR_NSA_NETWORK_INFRASTRUCTURE_SECURITY_GUIDANCE_20220301.PDF" rel="noopener noreferrer"&gt;NSA&lt;/a&gt; hardening guidance and &lt;a href="https://www.mitre.org/capabilities/cybersecurity/overview/cybersecurity-resources/standards" rel="noopener noreferrer"&gt;MITRE&lt;/a&gt; ATT&amp;amp;CK techniques. Apart from these two, ARMO has developed two security frameworks of its own for Kubernetes, named ArmoBest and DevOpsBest, which also work with Kubescape.&lt;/p&gt;

&lt;h3&gt;
  
  
  How does Kubescape work?
&lt;/h3&gt;

&lt;p&gt;Kubescape has capabilities to run inside your cluster as well as in a CI/CD pipeline. This flexibility allows you to keep a constant check on your clusters as well as CI/CD pipelines. &lt;/p&gt;

&lt;p&gt;Unlike kube-bench, Kubescape's tests are not divided into sections. Instead, Kubescape uses controls: in Kubescape's ecosystem the NSA/MITRE/ArmoBest/DevOpsBest guidelines are broken down into small sets of policies (the controls), and each control has its own set of rules against which the cluster or pipeline is scanned. Using the web interface, you can also build your own framework by combining the controls provided on the portal. Once the configuration is scanned, Kubescape sends the details to &lt;a href="https://cloud.armosec.io/" rel="noopener noreferrer"&gt;ARMO's portal&lt;/a&gt;, where you can see the security posture of your cluster or pipeline. A major difference between kube-bench and Kubescape is that Kubescape goes into specifics when it checks Kubernetes artifacts: on the portal it takes you to the exact line of the configuration of the particular artifact(s) because of which a control is failing (an example is shown in the image below):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2azicy7bue64bsaay2u3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2azicy7bue64bsaay2u3.png" alt="Kubescape Web UI showing a line in a particular configuration where the issue lies"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you do not wish to use ARMO's portal, you can run the scan from the CLI alone and keep the results local. The drawback is that Kubescape itself does not schedule scans, but a utility such as cron can do that for you. Following are some examples of CLI output:&lt;/p&gt;

&lt;h4&gt;
  
  
  Controls check example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[control: Naked PODs - https://hub.armo.cloud/docs/c-0073] failed 😥
Description: It is not recommended to create PODs without parental Deployment, ReplicaSet, StatefulSet etc.Manual creation if PODs may lead to a configuration drift and other untracked changes in the system. Such PODs won't be automatically rescheduled by Kubernetes in case of a crash or infrastructure failure. This control identifies every POD that does not have a corresponding parental object.
Failed:
 Namespace default
   Pod - bus
 Namespace kube-system
   Pod - storage-provisioner
Summary - Passed:22   Excluded:0   Failed:2   Total:24
Remediation: Create necessary Deployment object for every POD making any POD a first class citizen in your IaC architecture.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[control: Enforce Kubelet client TLS authentication - https://hub.armo.cloud/docs/c-0070] passed 👍
Description: Kubelets are the node level orchestrator in Kubernetes control plane. They are publishing service port 10250 where they accept commands from API server. Operator must make sure that only API server is allowed to submit commands to Kubelet. This is done through client certificate verification, must configure Kubelet with client CA file to use for this purpose.
Summary - Passed:2   Excluded:0   Failed:0   Total:2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
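&lt;p&gt;To make the difference between the two tools concrete, here is an added example (written for this page, not code from the original post, from kube-bench or from Kubescape) that does what the post says kube-bench's Kubernetes Policies section does not: it evaluates the checks behind kube-bench 5.2.1 to 5.2.5 (privileged containers, host PID/IPC/network namespaces, allowPrivilegeEscalation) and Kubescape's "Naked PODs" control shown above against pod objects, and reports the namespace/pod plus the JSON path of the offending field, which is the pinpointing that Kubescape's portal provides. It assumes input in the shape of &lt;code&gt;kubectl get pods -A -o json&lt;/code&gt;; the control numbers (&lt;code&gt;5.2.x&lt;/code&gt;, &lt;code&gt;C-0073&lt;/code&gt;) are used as labels only, and the pod list embedded in the script is constructed, not taken from a real cluster.&lt;/p&gt;

```python
"""Pinpoint the pods (and fields) that violate host-namespace and privilege controls.

Added example for this page; it is not code from the original post, from
kube-bench or from Kubescape. It re-implements the checks behind kube-bench
5.2.1 to 5.2.5 and Kubescape's "Naked PODs" control (C-0073) against pod
objects in the shape of `kubectl get pods -A -o json`, and reports the
resource plus the JSON path of the offending field. The control numbers are
used as labels only; the pod list below is constructed, not a real cluster.
"""
import json
import sys

CONSTRUCTED_POD_LIST = {
    "kind": "PodList",
    "items": [
        {   # a bare pod with no controller: fails the naked-pod control
            "metadata": {"name": "bus", "namespace": "default"},
            "spec": {"containers": [{"name": "bus", "image": "busybox",
                                     "securityContext": {"allowPrivilegeEscalation": False}}]},
        },
        {   # controller-managed, but privileged and in the host PID and network namespaces
            "metadata": {"name": "node-agent-x7k2p", "namespace": "kube-system",
                         "ownerReferences": [{"kind": "DaemonSet", "name": "node-agent"}]},
            "spec": {"hostPID": True, "hostNetwork": True,
                     "containers": [{"name": "agent", "image": "example/agent",
                                     "securityContext": {"privileged": True}}]},
        },
        {   # controller-managed and clean
            "metadata": {"name": "web-6d4f", "namespace": "default",
                         "ownerReferences": [{"kind": "ReplicaSet", "name": "web-6d4f"}]},
            "spec": {"containers": [{"name": "web", "image": "example/web",
                                     "securityContext": {"allowPrivilegeEscalation": False,
                                                         "runAsNonRoot": True}}]},
        },
    ],
}

# Pod-level boolean fields and the kube-bench check each one maps to.
POD_LEVEL_CONTROLS = (
    ("5.2.2", "hostPID", "shares the host PID namespace"),
    ("5.2.3", "hostIPC", "shares the host IPC namespace"),
    ("5.2.4", "hostNetwork", "shares the host network namespace"),
)


def check_pod(pod):
    """Return (control, namespace/name, json_path, message) tuples for one pod."""
    meta = pod.get("metadata", {})
    spec = pod.get("spec", {})
    ref = "%s/%s" % (meta.get("namespace", "default"), meta.get("name", "?"))
    findings = []

    # Kubescape "Naked PODs": nothing owns this pod, so nothing reschedules it.
    if not meta.get("ownerReferences"):
        findings.append(("C-0073", ref, "metadata.ownerReferences", "pod has no controller"))

    for control, field, message in POD_LEVEL_CONTROLS:
        if spec.get(field) is True:
            findings.append((control, ref, "spec." + field, message))

    for list_name in ("initContainers", "containers"):
        for idx, ctr in enumerate(spec.get(list_name, [])):
            sc = ctr.get("securityContext") or {}
            path = "spec.%s[%d].securityContext" % (list_name, idx)
            name = ctr.get("name", "?")
            if sc.get("privileged") is True:
                findings.append(("5.2.1", ref, path + ".privileged",
                                 "container %s is privileged" % name))
            # The field defaults to true when omitted (and is forced true for
            # privileged containers), so only an explicit false is acceptable.
            if sc.get("allowPrivilegeEscalation") is not False:
                findings.append(("5.2.5", ref, path + ".allowPrivilegeEscalation",
                                 "container %s allows privilege escalation" % name))
    return findings


def scan_pod_list(pod_list):
    """Run check_pod over every item of a PodList."""
    findings = []
    for pod in pod_list.get("items", []):
        findings.extend(check_pod(pod))
    return findings


def failing_resources_per_control(findings):
    """Count distinct failing resources per control, like Kubescape's summary table."""
    per_control = {}
    for control, ref, _path, _message in findings:
        per_control.setdefault(control, set()).add(ref)
    return {control: len(refs) for control, refs in per_control.items()}


def main(argv):
    pod_list = json.load(open(argv[1])) if len(argv) == 2 else CONSTRUCTED_POD_LIST
    findings = scan_pod_list(pod_list)
    for control, ref, path, message in findings:
        print("[%s] %s  %s  (%s)" % (control, ref, path, message))
    print("failing resources per control:", failing_resources_per_control(findings))
    return 1 if findings else 0


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

&lt;p&gt;Feed it a saved pod list (&lt;code&gt;kubectl get pods -A -o json&lt;/code&gt; redirected to a file) as its only argument; without one it scans the constructed list. Two design points worth noting: &lt;code&gt;allowPrivilegeEscalation&lt;/code&gt; is treated as failing unless it is explicitly &lt;code&gt;false&lt;/code&gt;, because Kubernetes defaults it to true when omitted, and init containers are checked alongside regular containers, since a privileged init container is just as much a host escape as a privileged main container.&lt;/p&gt;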



&lt;h4&gt;
  
  
  Summary Example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FRAMEWORKS: DevOpsBest (risk: 43.94), MITRE (risk: 15.93), ArmoBest (risk: 27.62), NSA (risk: 30.72)
+-----------------------------------------------------------------------+------------------+--------------------+---------------+--------------+
|                             CONTROL NAME                              | FAILED RESOURCES | EXCLUDED RESOURCES | ALL RESOURCES | % RISK-SCORE |
+-----------------------------------------------------------------------+------------------+--------------------+---------------+--------------+
| Access Kubernetes dashboard                                           |        0         |         0          |      98       |      0%      |
| Access container service account                                      |        41        |         0          |      45       |     91%      |
| Access tiller endpoint                                                |        0         |         0          |       0       |   skipped    |
| Allow privilege escalation                                            |        24        |         0          |      25       |     96%      |
| Allowed hostPath                                                      |        4         |         0          |      25       |     16%      |
.
.
.
.
.
+-----------------------------------------------------------------------+------------------+--------------------+---------------+--------------+
|                           RESOURCE SUMMARY                            |       131        |         0          |      185      |    28.35%    |
+-----------------------------------------------------------------------+------------------+--------------------+---------------+--------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Deployment methods
&lt;/h3&gt;

&lt;p&gt;Kubescape can be deployed on any Kubernetes cluster for routine check-ups, as well as in the CI/CD pipeline to ensure that no misconfiguration makes its way to production. It can run on any machine, provided that machine has a kubeconfig with access to the cluster.&lt;/p&gt;

&lt;p&gt;You can install or run it using the set of commands available on &lt;a href="https://portal.armo.cloud" rel="noopener noreferrer"&gt;ARMO's portal&lt;/a&gt;. Once you sign up on the portal, you get an account ID, together with a set of commands that embed this account ID so that all your cluster and CI/CD scans show up on one page. Treat the account ID as sensitive, since it is what ties scan submissions to your account. The following image shows what those commands look like:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1te1fyn8ycwoutayb5ti.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1te1fyn8ycwoutayb5ti.png" alt="Kubescape Web UI showing command line instructions to install Kubescape"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you want to run Kubescape inside an air-gapped Kubernetes cluster, install the Kubescape CLI from &lt;a href="https://github.com/armosec/kubescape#install" rel="noopener noreferrer"&gt;Kubescape's GitHub repository&lt;/a&gt; and follow the instructions under the &lt;a href="https://github.com/armosec/kubescape#offlineair-gaped-environment-support" rel="noopener noreferrer"&gt;Offline/Air-gaped Environment Support&lt;/a&gt; section of the repository's README.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where is it best used?
&lt;/h3&gt;

&lt;p&gt;Kubescape works efficiently on your regular clusters as well as on ephemeral ones (created for CI/CD checks). It shines when it comes to the configuration of the artifacts inside the cluster, in other words the Kubernetes objects, because ARMO's portal provides a detailed analysis for every failed check: the issue is drilled down to the single line in your configuration because of which a control is failing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Integrations
&lt;/h3&gt;

&lt;p&gt;Kubescape natively integrates with Prometheus, Slack, Jenkins, CircleCI, GitHub, GitLab, Azure DevOps, GCP GKE, AWS EKS and others. The steps are well documented in both &lt;a href="https://hub.armosec.io/docs" rel="noopener noreferrer"&gt;ARMO's official docs&lt;/a&gt; and the &lt;a href="https://portal.armo.cloud/settings/integrations" rel="noopener noreferrer"&gt;Integrations page on ARMO's portal&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Kubescape and kube-bench differ in the frameworks they support, how they are deployed, and the way they perform scans and present results; each has its own strong areas. kube-bench proves its mettle when scanning the host: file permissions and ownership, and the flags of the Kubernetes control plane components. Kubescape shows its worth when scanning the objects inside the cluster, such as pods, namespaces, service accounts and so on. Keep in mind that ARMO's portal is a hosted solution, and to use it you have to share information about in-cluster resources with it via Kubescape. As discussed above, you can also use Kubescape in CLI-only mode (see the Offline/Air-gaped Environment Support section in Kubescape's GitHub repository).&lt;/p&gt;

&lt;p&gt;To summarize, I believe kube-bench and Kubescape complement each other. Use kube-bench while setting up the cluster or adding a new host to it, and again after upgrades or changes to the control plane configuration, since file permissions, ownership and component flags are mostly set once, and keeping the cluster's configuration out of reach of unauthorized users is critical. Once the cluster or new host is up and running, use Kubescape for regular scans of the artifacts inside the cluster, as it drills the issue down to the single line of configuration.&lt;/p&gt;

&lt;p&gt;I hope you found this post informative and engaging. For more posts like this one, do subscribe to our weekly newsletter. I’d love to hear your thoughts on this post, so do start a conversation on &lt;a href="https://www.linkedin.com/in/amardargad/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; :)&lt;/p&gt;

&lt;p&gt;Looking for help with cloud native security? Check out how we're helping startups &amp;amp; enterprises as a &lt;a href="https://www.infracloud.io/cloud-native-security-services/" rel="noopener noreferrer"&gt;DevSecOps consulting services provider&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>kubescape</category>
      <category>kubebench</category>
      <category>security</category>
    </item>
  </channel>
</rss>
