<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Mohammed Kashif Rafi</title>
    <description>The latest articles on Forem by Mohammed Kashif Rafi (@kashifrafi).</description>
    <link>https://forem.com/kashifrafi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2575086%2F67de27fd-e39c-4e34-8260-a00b75ec66f3.jpg</url>
      <title>Forem: Mohammed Kashif Rafi</title>
      <link>https://forem.com/kashifrafi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/kashifrafi"/>
    <language>en</language>
    <item>
      <title>KAAR AI-Powered Kubernetes Cluster Analysis and Remediation</title>
      <dc:creator>Mohammed Kashif Rafi</dc:creator>
      <pubDate>Fri, 23 May 2025 03:12:45 +0000</pubDate>
      <link>https://forem.com/kashifrafi/kaar-ai-powered-kubernetes-cluster-analysis-and-remediation-2gpn</link>
      <guid>https://forem.com/kashifrafi/kaar-ai-powered-kubernetes-cluster-analysis-and-remediation-2gpn</guid>
      <description>&lt;h1&gt;
  
  
  Introducing KAAR: AI-Powered Kubernetes Cluster Analysis and Remediation
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;KAAR&lt;/strong&gt; (Kubernetes AI-powered Analysis and Remediation) is a tool that automates the detection and resolution of Kubernetes Pod issues using &lt;strong&gt;k8sgpt&lt;/strong&gt; and &lt;strong&gt;AWS Bedrock&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this post, I’ll introduce KAAR, explain its value, and guide you on getting started with it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Kubernetes Pod Challenge
&lt;/h2&gt;

&lt;p&gt;Pods are the core of Kubernetes, but they often encounter issues that halt your applications:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ImagePullBackOff&lt;/strong&gt;: Pods fail to start due to invalid or inaccessible container images.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CrashLoopBackOff&lt;/strong&gt;: Containers crash repeatedly from misconfigured commands or errors like “executable file not found.”
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OOMKilled&lt;/strong&gt;: Pods are terminated for exceeding memory limits.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pending&lt;/strong&gt;: Pods can’t schedule due to resource constraints or node affinity issues.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manually debugging these with &lt;code&gt;kubectl describe&lt;/code&gt; and &lt;code&gt;kubectl logs&lt;/code&gt; is time-consuming, especially in large clusters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;KAAR automates&lt;/strong&gt; this process, leveraging AI to resolve Pod issues quickly and reliably, saving valuable time for DevOps teams.&lt;/p&gt;
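&lt;p&gt;To make the four failure modes concrete, here is a minimal sketch of where each one shows up in a Pod object. It is not KAAR code. It assumes the input is the dictionary you get from parsing &lt;code&gt;kubectl get pod NAME -o json&lt;/code&gt;, and the function name &lt;code&gt;pod_issues&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
from typing import Any, Dict, List


def pod_issues(pod: Dict[str, Any]) -> List[str]:
    """Return the failure reasons visible in one Pod object (parsed JSON)."""
    status = pod.get("status", {})
    reasons: List[str] = []

    for cs in status.get("containerStatuses", []):
        # ImagePullBackOff and CrashLoopBackOff appear as the waiting reason.
        waiting = cs.get("state", {}).get("waiting") or {}
        if waiting.get("reason") in ("ImagePullBackOff", "CrashLoopBackOff"):
            reasons.append(waiting["reason"])

        # OOMKilled appears as the termination reason of the current or last run.
        for key in ("state", "lastState"):
            terminated = cs.get(key, {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                reasons.append("OOMKilled")
                break

    # Report plain "Pending" only when no container-level reason explains it,
    # for example when the scheduler cannot place the Pod on any node.
    if not reasons and status.get("phase") == "Pending":
        reasons.append("Pending")

    return reasons
```

&lt;p&gt;A container that is repeatedly OOM-killed usually reports both &lt;code&gt;CrashLoopBackOff&lt;/code&gt; (current state) and &lt;code&gt;OOMKilled&lt;/code&gt; (last state), which is why the sketch returns a list rather than a single label.&lt;/p&gt;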




&lt;h2&gt;
  
  
  What is KAAR?
&lt;/h2&gt;

&lt;p&gt;KAAR is an open-source tool designed to simplify Kubernetes Pod management. It integrates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;k8sgpt&lt;/strong&gt;: A diagnostic tool that scans Kubernetes clusters for Pod issues.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS Bedrock&lt;/strong&gt;: Uses the Claude v2 model to classify issues and recommend fixes.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS SNS and CloudWatch&lt;/strong&gt;: Sends notifications and logs results for team visibility.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;kubectl&lt;/strong&gt;: Applies automated fixes to get Pods back on track.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;KAAR is perfect for DevOps engineers, SREs, and architects who want to minimize manual intervention. It currently focuses on Pod remediation, with plans to support &lt;strong&gt;Services&lt;/strong&gt; and &lt;strong&gt;Deployments&lt;/strong&gt; in future releases.&lt;/p&gt;




&lt;h2&gt;
  
  
  How KAAR Works
&lt;/h2&gt;

&lt;p&gt;KAAR follows a streamlined, AI-powered workflow to fix Pod issues:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Cluster Scanning&lt;/strong&gt;: KAAR uses k8sgpt to analyze your cluster and identify Pod issues.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Issue Classification&lt;/strong&gt;: Findings are processed by AWS Bedrock to determine issue type.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Remediation&lt;/strong&gt;: Applies fixes via &lt;code&gt;kubectl&lt;/code&gt;, like updating images or adjusting limits.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verification&lt;/strong&gt;: Ensures Pods are Running and healthy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Notification&lt;/strong&gt;: Logs results to CloudWatch and sends SNS notifications.&lt;/li&gt;
&lt;/ol&gt;
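&lt;p&gt;The five steps above can be sketched as a small pipeline. This is an illustration of the control flow only, not the actual KAAR implementation: &lt;code&gt;Finding&lt;/code&gt;, &lt;code&gt;run_workflow&lt;/code&gt;, and the five callables are hypothetical names, and the &lt;code&gt;Unhealthy&lt;/code&gt; wording is an assumption (the post only shows the &lt;code&gt;Healthy&lt;/code&gt; message).&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Finding:
    namespace: str
    pod: str
    detail: str            # raw text reported by the scanner
    issue_type: str = ""   # filled in by the classification step


def run_workflow(
    scan: Callable[[], List[Finding]],
    classify: Callable[[Finding], str],
    remediate: Callable[[Finding], None],
    verify: Callable[[Finding], bool],
    notify: Callable[[str], None],
) -> List[str]:
    """Run scan, classify, remediate, verify, notify for every finding."""
    messages: List[str] = []
    for finding in scan():
        finding.issue_type = classify(finding)
        remediate(finding)
        state = "Healthy" if verify(finding) else "Unhealthy"
        message = f"Pod {finding.pod} in {finding.namespace} is {state}."
        notify(message)
        messages.append(message)
    return messages
```

&lt;p&gt;Passing the steps in as callables keeps the flow testable without a cluster: in a real run, &lt;code&gt;scan&lt;/code&gt; would wrap k8sgpt, &lt;code&gt;classify&lt;/code&gt; would call Bedrock, and &lt;code&gt;notify&lt;/code&gt; would publish to SNS.&lt;/p&gt;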




&lt;h2&gt;
  
  
  Real-World Examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Scenario 1: CrashLoopBackOff
&lt;/h3&gt;

&lt;p&gt;A Pod &lt;code&gt;nginx-pod&lt;/code&gt; is stuck due to an invalid command. KAAR:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Detects the issue with k8sgpt.
&lt;/li&gt;
&lt;li&gt;Uses Bedrock to classify it as CrashLoopBackOff.
&lt;/li&gt;
&lt;li&gt;Updates the Pod’s command.
&lt;/li&gt;
&lt;li&gt;Verifies it is Running.
&lt;/li&gt;
&lt;li&gt;Sends an SNS alert: “Pod nginx-pod in default is Healthy.”&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Scenario 2: OOMKilled
&lt;/h3&gt;

&lt;p&gt;A Pod &lt;code&gt;memory-hog&lt;/code&gt; is terminated for exceeding its memory limit. KAAR:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identifies the OOMKilled issue.
&lt;/li&gt;
&lt;li&gt;Increases the memory limit.
&lt;/li&gt;
&lt;li&gt;Confirms stability.
&lt;/li&gt;
&lt;li&gt;Sends an alert: “Pod memory-hog in default is Healthy.”&lt;/li&gt;
&lt;/ul&gt;
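&lt;p&gt;The post does not say how much KAAR raises the limit by. As a sketch under stated assumptions, the helper below doubles the current limit and caps it at 4Gi. The policy, the cap, and the names &lt;code&gt;parse_memory&lt;/code&gt; and &lt;code&gt;bumped_limit&lt;/code&gt; are all hypothetical, and only integer quantities with the binary suffixes Ki, Mi, and Gi are handled.&lt;/p&gt;

```python
# Binary suffixes used by Kubernetes memory quantities.
_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '128Mi' or '1Gi' to bytes."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain byte count, e.g. '134217728'


def bumped_limit(current: str, factor: float = 2.0, cap: str = "4Gi") -> str:
    """Scale a memory limit by `factor`, never exceeding `cap`; result in Mi."""
    target = min(int(parse_memory(current) * factor), parse_memory(cap))
    return f"{target // _UNITS['Mi']}Mi"
```

&lt;p&gt;The cap matters: without it, a workload with a genuine memory leak would have its limit raised on every run until it starves the node.&lt;/p&gt;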





&lt;h2&gt;
  
  
  Why Choose KAAR?
&lt;/h2&gt;

&lt;p&gt;KAAR offers compelling benefits for Kubernetes admins:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Time Savings&lt;/strong&gt;: Automates remediation.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accuracy&lt;/strong&gt;: AI-driven classification and suggestions.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS Integration&lt;/strong&gt;: Works with Bedrock, SNS, CloudWatch.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Future-Ready&lt;/strong&gt;: Coming support for Services and Deployments.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As an AWS Community Builder, I built KAAR to reflect AWS’s commitment to automation and innovation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started with KAAR
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;p&gt;Ensure you have:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kubernetes Cluster&lt;/strong&gt;: AWS EKS, minikube, or kind.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   minikube start &lt;span class="nt"&gt;--driver&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;docker
   kubectl get nodes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;AWS Credentials&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  aws configure
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Python 3.11 with boto3 and pyyaml&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  python3.11 &lt;span class="nt"&gt;-m&lt;/span&gt; venv python311_env
  &lt;span class="nb"&gt;source &lt;/span&gt;python311_env/bin/activate
  pip &lt;span class="nb"&gt;install &lt;/span&gt;boto3 pyyaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;strong&gt;Install KAAR via pip&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt; pip &lt;span class="nb"&gt;install &lt;/span&gt;kaar
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Create my_config.yaml&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws:
  region: us-east-1
  sns_topic_arn: arn:aws:sns:us-east-1:YOUR_AWS_ACCOUNT_ID:KAARAlerts
  log_group: /kaar/notifications
  log_stream: kaar-notifier

k8sgpt:
  backend: amazonbedrock
  explain: true

bedrock:
  model: anthropic.claude-v2:1
  max_tokens: 50
  temperature: 0.5

remediation:
  max_attempts: 5
  retry_interval_seconds: 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
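&lt;p&gt;The &lt;code&gt;remediation&lt;/code&gt; section of the configuration sets &lt;code&gt;max_attempts&lt;/code&gt; and &lt;code&gt;retry_interval_seconds&lt;/code&gt;. How KAAR consumes them internally is not shown here; one plausible reading is a bounded polling loop like the sketch below. The function &lt;code&gt;wait_until_healthy&lt;/code&gt; and its parameters are hypothetical.&lt;/p&gt;

```python
import time
from typing import Any, Callable, Dict


def wait_until_healthy(
    is_healthy: Callable[[], bool],
    remediation_cfg: Dict[str, Any],
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `is_healthy` up to max_attempts times, pausing between attempts."""
    attempts = int(remediation_cfg.get("max_attempts", 5))
    interval = float(remediation_cfg.get("retry_interval_seconds", 5))
    for attempt in range(1, attempts + 1):
        if is_healthy():
            return True
        if attempt != attempts:   # no point sleeping after the final attempt
            sleep(interval)
    return False
```

&lt;p&gt;With the values from the example file, a Pod gets roughly 20 seconds (five checks, four pauses of five seconds) to reach a healthy state before the attempt is reported as failed. After loading the file with &lt;code&gt;yaml.safe_load&lt;/code&gt;, you would pass &lt;code&gt;config["remediation"]&lt;/code&gt; as the second argument.&lt;/p&gt;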






&lt;h2&gt;
  
  
  Future Plans
&lt;/h2&gt;

&lt;p&gt;KAAR is poised to become a comprehensive Kubernetes management tool. Planned enhancements include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Service Remediation: Support for issues like SelectorMismatch and LoadBalancerPending.&lt;/li&gt;
&lt;li&gt;Deployment Support: Fixes for Deployment misconfigurations.&lt;/li&gt;
&lt;li&gt;EventBridge Integration: Scheduled runs for proactive monitoring, inspired by my KAAAS project.&lt;/li&gt;
&lt;li&gt;Local LLMs: Support for Ollama as an alternative to Bedrock.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>agents</category>
      <category>kubernetes</category>
      <category>automation</category>
    </item>
    <item>
      <title>KAAAS: AI-Powered Kubernetes Cluster Analysis and Solutions</title>
      <dc:creator>Mohammed Kashif Rafi</dc:creator>
      <pubDate>Mon, 19 May 2025 16:24:04 +0000</pubDate>
      <link>https://forem.com/kashifrafi/kaaas-ai-powered-kubernetes-cluster-analysis-and-solutions-d56</link>
      <guid>https://forem.com/kashifrafi/kaaas-ai-powered-kubernetes-cluster-analysis-and-solutions-d56</guid>
      <description>&lt;h2&gt;
  
  
  KAAAS: AI-Powered Kubernetes Cluster Analysis and Solutions
&lt;/h2&gt;

&lt;p&gt;Kubernetes has revolutionized container orchestration, enabling organizations to manage complex, distributed applications at scale. However, with great power comes great complexity—troubleshooting Kubernetes clusters can be a daunting task, especially in production environments with thousands of resources.&lt;/p&gt;

&lt;p&gt;Enter &lt;strong&gt;KAAAS (Kubernetes AI-powered Cluster Analysis and Solutions)&lt;/strong&gt;, a tool designed to simplify Kubernetes management with artificial intelligence (AI). In this blog post, we’ll explore what KAAAS is, how it works, and why it matters for DevOps teams and Site Reliability Engineers (SREs) who need to keep Kubernetes clusters healthy.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is KAAAS?
&lt;/h2&gt;

&lt;p&gt;KAAAS is an AI-driven framework that automates the analysis, diagnosis, and resolution of issues in Kubernetes clusters. Built with the goal of reducing the operational burden on teams, KAAAS integrates with Kubernetes environments to scan clusters, identify potential problems, and provide actionable solutions in a user-friendly format.&lt;/p&gt;

&lt;p&gt;Unlike traditional monitoring tools that often overwhelm users with raw data, KAAAS leverages AI to interpret complex diagnostic information and present it in plain English, making it accessible to both seasoned Kubernetes experts and newcomers.&lt;/p&gt;

&lt;p&gt;At its core, KAAAS combines the capabilities of tools like &lt;strong&gt;K8sGPT&lt;/strong&gt;—a CNCF Sandbox project that uses AI to diagnose Kubernetes issues—with custom agents for cluster analysis, issue identification, and notification. By integrating with AI backends such as &lt;strong&gt;Amazon Bedrock&lt;/strong&gt; or &lt;strong&gt;Ollama&lt;/strong&gt;, KAAAS transforms raw cluster data into meaningful insights, helping teams maintain cluster health without getting bogged down in the intricacies of Kubernetes internals.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Do We Need AI in Kubernetes Management?
&lt;/h2&gt;

&lt;p&gt;Kubernetes clusters are inherently complex, often spanning multiple nodes and managing thousands of resources like pods, services, and deployments. Common issues—such as pods stuck in a &lt;code&gt;CrashLoopBackOff&lt;/code&gt; state, image pull failures, or resource misconfigurations—can be time-consuming to diagnose and fix manually.&lt;/p&gt;

&lt;p&gt;Traditional tools like &lt;code&gt;kubectl&lt;/code&gt; provide raw data, but interpreting logs, events, and metrics requires deep expertise and significant time investment.&lt;/p&gt;

&lt;p&gt;AI-powered tools like KAAAS address these challenges by automating the diagnostic process. They can analyze vast amounts of cluster data, identify patterns, and pinpoint the root cause of issues faster than a human could. Moreover, AI tools can suggest solutions based on best practices and historical data, reducing the &lt;strong&gt;mean time to resolution (MTTR)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For organizations running mission-critical workloads on Kubernetes, this automation is invaluable—it minimizes downtime, improves reliability, and frees up engineering teams to focus on innovation rather than firefighting.&lt;/p&gt;




&lt;h2&gt;
  
  
  How Does KAAAS Work?
&lt;/h2&gt;

&lt;p&gt;KAAAS operates through a series of modular agents, each responsible for a specific aspect of cluster analysis and management.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Cluster Scanning with K8sGPT
&lt;/h3&gt;

&lt;p&gt;KAAAS uses &lt;strong&gt;K8sGPT&lt;/strong&gt; to scan Kubernetes clusters for issues. K8sGPT, a powerful open-source tool, integrates with AI backends to analyze cluster resources like pods, deployments, and services. It identifies critical issues—such as a pod failing to schedule due to resource constraints or a service with no endpoints—and generates detailed explanations in plain English.&lt;/p&gt;

&lt;p&gt;KAAAS builds on this capability by automating the scanning process and tailoring the analysis to specific cluster configurations.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Custom Agents for Analysis
&lt;/h3&gt;

&lt;p&gt;KAAAS employs several custom agents to process the raw output from K8sGPT and other tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ClusterLimitAgent&lt;/strong&gt;: Checks the number of clusters based on node count, ensuring compliance with licensing limits (e.g., KAAAS is free for one cluster).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;K8sGPTAnalysisAgent&lt;/strong&gt;: Runs K8sGPT scans and captures detailed output, including error messages and suggested fixes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OutputParserAgent&lt;/strong&gt;: Parses K8sGPT output to extract structured information about issues, such as error details, affected resources, and potential solutions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IssueIdentificationAgent&lt;/strong&gt;: Identifies the type of issue (e.g., &lt;code&gt;ImagePullBackOff&lt;/code&gt;, &lt;code&gt;CrashLoopBackOff&lt;/code&gt;) and provides context about the affected resource, including its namespace and name.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These agents work together to transform raw data into actionable insights, ensuring that users don’t have to sift through logs manually.&lt;/p&gt;
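&lt;p&gt;As an illustration of what the parsing step might look like, the sketch below flattens the JSON that &lt;code&gt;k8sgpt analyze --explain --output json&lt;/code&gt; prints into one record per affected resource. It is not the actual OutputParserAgent. The field names (&lt;code&gt;results&lt;/code&gt;, &lt;code&gt;kind&lt;/code&gt;, &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;error[].Text&lt;/code&gt;, &lt;code&gt;details&lt;/code&gt;) are assumptions based on the K8sGPT JSON output format, so check them against the version you have installed.&lt;/p&gt;

```python
import json
from typing import Dict, List


def parse_k8sgpt_output(raw: str) -> List[Dict[str, str]]:
    """Flatten k8sgpt JSON results into one record per affected resource."""
    report = json.loads(raw)
    issues: List[Dict[str, str]] = []
    for result in report.get("results") or []:
        # Namespaced resources are reported as "namespace/name".
        namespace, _, name = result.get("name", "").rpartition("/")
        errors = [e.get("Text", "") for e in result.get("error") or []]
        issues.append({
            "kind": result.get("kind", ""),
            "namespace": namespace,
            "name": name,
            "error": "; ".join(errors),
            "solution": result.get("details", ""),
        })
    return issues
```

&lt;p&gt;Each record carries the resource kind, namespace, name, error text, and suggested solution, which is the information the later identification and notification steps need.&lt;/p&gt;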

&lt;h3&gt;
  
  
  3. Notification and Logging
&lt;/h3&gt;

&lt;p&gt;Once issues are identified, KAAAS uses AWS services like &lt;strong&gt;SNS (Simple Notification Service)&lt;/strong&gt; and &lt;strong&gt;CloudWatch&lt;/strong&gt; to notify users and log events.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;NotificationAgent&lt;/strong&gt; sends alerts via SNS, summarizing the issues and their solutions in a concise format.&lt;/li&gt;
&lt;li&gt;It also logs detailed information to CloudWatch, providing a historical record for auditing and further analysis.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This integration ensures that teams are promptly informed of critical issues and can take action without delay.&lt;/p&gt;
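&lt;p&gt;A rough sketch of the notification step follows. It is not the actual NotificationAgent: the message layout and the function names are hypothetical. The boto3 calls (&lt;code&gt;sns.publish&lt;/code&gt; and &lt;code&gt;logs.put_log_events&lt;/code&gt;) are real, and the log group and log stream must already exist before events can be written to them.&lt;/p&gt;

```python
import json
import time
from typing import Any, Dict, List


def build_sns_message(issues: List[Dict[str, str]], cluster: str) -> Dict[str, str]:
    """Build the Subject/Message pair for an SNS alert."""
    subject = f"[KAAAS] {len(issues)} issue(s) in cluster {cluster}"
    lines = [f"- {i['kind']} {i['namespace']}/{i['name']}: {i['error']}" for i in issues]
    # SNS rejects subjects of 100 characters or more, so truncate defensively.
    return {"Subject": subject[:99], "Message": "\n".join(lines)}


def build_log_event(issues: List[Dict[str, str]]) -> Dict[str, Any]:
    """Build one CloudWatch Logs event; timestamps are milliseconds since epoch."""
    return {"timestamp": int(time.time() * 1000), "message": json.dumps(issues)}


def send(issues, cluster, topic_arn, log_group, log_stream, region):
    """Publish the alert and write the log record (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builders above work without it

    sns = boto3.client("sns", region_name=region)
    sns.publish(TopicArn=topic_arn, **build_sns_message(issues, cluster))
    logs = boto3.client("logs", region_name=region)
    logs.put_log_events(
        logGroupName=log_group,
        logStreamName=log_stream,
        logEvents=[build_log_event(issues)],
    )
```

&lt;p&gt;Keeping the message builders separate from the AWS calls means the alert format can be checked locally before any credentials are configured.&lt;/p&gt;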

&lt;h3&gt;
  
  
  4. Integration with AWS Resources
&lt;/h3&gt;

&lt;p&gt;KAAAS seamlessly integrates with AWS resources, leveraging tools like &lt;strong&gt;AWS Controllers for Kubernetes (ACK)&lt;/strong&gt; to manage not only Kubernetes resources but also associated AWS services. For example, if a cluster issue is related to an AWS-managed resource (e.g., an RDS database or an S3 bucket), KAAAS can analyze and provide recommendations for those resources as well, offering a holistic view of the environment.&lt;/p&gt;




&lt;h2&gt;
  
  
  Benefits of Using KAAAS
&lt;/h2&gt;

&lt;p&gt;KAAAS offers several advantages for teams managing Kubernetes clusters:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Simplified Troubleshooting
&lt;/h3&gt;

&lt;p&gt;By automating the analysis and diagnosis of cluster issues, KAAAS eliminates the need for manual log diving. Its AI-driven approach provides clear, actionable insights, making it easier for teams to resolve issues quickly.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Proactive Issue Detection
&lt;/h3&gt;

&lt;p&gt;KAAAS continuously monitors clusters, identifying potential problems before they escalate. For example, it can detect a pod in a &lt;code&gt;Pending&lt;/code&gt; state due to resource constraints and suggest adjustments to resource limits or node scaling.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. AI-Powered Automatic Analysis and Notifications
&lt;/h3&gt;

&lt;p&gt;KAAAS scans your cluster regularly, finds problems, suggests solutions using AI, and notifies the admin, keeping your environment healthy with minimal effort.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Enhanced Collaboration
&lt;/h3&gt;

&lt;p&gt;The notification system in KAAAS fosters collaboration by keeping all team members informed of cluster health. Whether you’re a DevOps engineer, an SRE, or a developer, you’ll have access to the same insights, enabling faster decision-making.&lt;/p&gt;




&lt;h2&gt;
  
  
  High-Level Flow
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwyrr00g7fohtu18kzx65.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwyrr00g7fohtu18kzx65.jpg" alt=" " width="570" height="882"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites for Using KAAAS
&lt;/h2&gt;

&lt;p&gt;Before you can use KAAAS, ensure you have the following prerequisites in place:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Kubernetes Cluster&lt;/strong&gt;: A running Kubernetes cluster (self-managed, on a cloud provider like AWS EKS, or a local setup like Minikube).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;kubectl&lt;/strong&gt;: The Kubernetes command-line tool installed and configured to interact with your cluster.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;K8sGPT&lt;/strong&gt;: KAAAS relies on K8sGPT for cluster scanning. Install it using a package manager like &lt;code&gt;brew install k8sgpt&lt;/code&gt; or download the binary from the &lt;a href="https://github.com/k8sgpt-ai/k8sgpt" rel="noopener noreferrer"&gt;K8sGPT GitHub repository&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Python 3.11+&lt;/strong&gt;: KAAAS is a Python-based tool, and we recommend using Python 3.11 or above for optimal compatibility and performance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;pip&lt;/strong&gt;: The Python package manager to install KAAAS and its dependencies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS Credentials&lt;/strong&gt;: Configure your AWS credentials using the AWS CLI (&lt;code&gt;aws configure&lt;/code&gt;) or environment variables. Ensure you have permissions for SNS and CloudWatch.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;K8sGPT Configuration&lt;/strong&gt;: Configure K8sGPT with an AI backend (e.g., Ollama or Amazon Bedrock). For example, set up a &lt;code&gt;k8sgpt.yaml&lt;/code&gt; file with your backend details, such as model name, base URL, and authentication tokens.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;kubectl Access&lt;/strong&gt;: Ensure &lt;code&gt;kubectl&lt;/code&gt; is configured to access your cluster (&lt;code&gt;kubectl config view&lt;/code&gt; to verify).&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Getting Started with KAAAS
&lt;/h2&gt;

&lt;p&gt;KAAAS is available on PyPI, making it easy to install and use. We recommend setting up a Python virtual environment to manage dependencies cleanly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Set Up a Python Virtual Environment
&lt;/h3&gt;

&lt;p&gt;Create and activate a virtual environment using Python 3.11 or above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3.11 &lt;span class="nt"&gt;-m&lt;/span&gt; venv kaaas_env
&lt;span class="nb"&gt;source &lt;/span&gt;kaaas_env/bin/activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Install KAAAS
&lt;/h3&gt;

&lt;p&gt;With the virtual environment activated, install KAAAS using pip:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;kaaas
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will install KAAAS and its dependencies, including boto3 for AWS integration and pyyaml for configuration parsing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Set Up AWS Credentials
&lt;/h3&gt;

&lt;p&gt;Configure your AWS credentials to enable KAAAS to interact with SNS and CloudWatch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws configure
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Provide your AWS Access Key ID, Secret Access Key, region (e.g., us-east-1), and output format.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Configure KAAAS
&lt;/h3&gt;

&lt;p&gt;Create a config.yaml file specifying your backend LLM, AWS region, SNS topic ARN, and CloudWatch log group/stream. Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;backend_llm: ollama &lt;span class="c"&gt;#Your AI backend&lt;/span&gt;
aws_region: us-east-1
sns_topic_arn: arn:aws:sns:us-east-1:xxxxxxxx:kaaasAlerts &lt;span class="c"&gt;# your SNS arn&lt;/span&gt;
log_group: /kaaas/notifications
log_stream: kaas
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: Run KAAAS
&lt;/h3&gt;

&lt;p&gt;After installation, run KAAAS by providing the path to your configuration file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kaaas &lt;span class="nt"&gt;--config&lt;/span&gt; config.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;KAAAS will scan your cluster, analyze issues using K8sGPT, and send notifications via SNS while logging details to CloudWatch.&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 6: Automating KAAAS with a Cron Job
&lt;/h3&gt;

&lt;p&gt;To ensure your Kubernetes cluster is regularly monitored and analyzed, you can automate KAAAS using a cron job. This helps maintain cluster health by running automated scans at scheduled intervals without manual intervention.&lt;/p&gt;

&lt;p&gt;Here’s an example &lt;code&gt;bash&lt;/code&gt; script to use in your cron job:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;

&lt;span class="c"&gt;# Set PATH explicitly&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/root/python311_env/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"&lt;/span&gt;

&lt;span class="c"&gt;# Activate virtual environment&lt;/span&gt;
&lt;span class="nb"&gt;source&lt;/span&gt; /root/python311_env/bin/activate

&lt;span class="c"&gt;# Define config and log file paths&lt;/span&gt;
&lt;span class="nv"&gt;CONFIG_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/root/KAAAS/config.yaml"&lt;/span&gt;
&lt;span class="nv"&gt;LOG_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/root/KAAAS/kaaas-&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%F&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;.log"&lt;/span&gt;

&lt;span class="c"&gt;# Run the command&lt;/span&gt;
/root/python311_env/bin/kaaas &lt;span class="nt"&gt;--config&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CONFIG_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOG_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; 2&amp;gt;&amp;amp;1

&lt;span class="c"&gt;# Deactivate virtual environment&lt;/span&gt;
deactivate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To run this script daily at 2 AM, open your crontab with &lt;code&gt;crontab -e&lt;/code&gt; and add the following line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;0 2 * * * /root/KAAAS/run_kaaas_cron.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Make sure to give execute permission to your script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x /root/KAAAS/run_kaaas_cron.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>agents</category>
      <category>kubernetes</category>
      <category>automation</category>
      <category>amazonbedrock</category>
    </item>
    <item>
      <title>Automated Guide to VPC Lattice Setup and East-West Traffic Testing in Amazon EKS</title>
      <dc:creator>Mohammed Kashif Rafi</dc:creator>
      <pubDate>Sun, 26 Jan 2025 05:33:08 +0000</pubDate>
      <link>https://forem.com/kashifrafi/automated-guide-to-vpc-lattice-setup-and-north-south-traffic-testing-in-amazon-eks-ak5</link>
      <guid>https://forem.com/kashifrafi/automated-guide-to-vpc-lattice-setup-and-north-south-traffic-testing-in-amazon-eks-ak5</guid>
      <description>&lt;p&gt;Modern applications require seamless and secure service-to-service communication, especially when operating at scale in the cloud. Amazon Elastic Kubernetes Service (EKS) combined with AWS VPC Lattice provides an efficient way to implement scalable, east-west communication and streamline traffic management between services.&lt;/p&gt;

&lt;p&gt;This script is a step-by-step guide to deploying the AWS Gateway API Controller for EKS using VPC Lattice. It simplifies the setup process by automating key configurations, such as IAM policies, service accounts, and Gateway API installations, ensuring a smooth integration with your Kubernetes environment.&lt;/p&gt;

&lt;p&gt;By following this script, you'll:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Set up the AWS Gateway API Controller with Helm.&lt;/li&gt;
&lt;li&gt;Deploy GatewayClass and Gateway configurations for traffic routing.&lt;/li&gt;
&lt;li&gt;Establish and verify east-west connectivity using HTTPRoutes.&lt;/li&gt;
&lt;li&gt;Test service-to-service communication with DNS-based routing.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#!/bin/bash

# Define Variables (Modify these as per your setup)
CLUSTER_NAME="xxxxxxxxxxxxx" # Your cluster name
AWS_REGION="xxxxxx"  # Modify as per your region
AWS_ACCOUNT_ID="xxxxxxxxxxxxxxx" # Your account id
VPC_ID="vpc-xxxxxxxxxxxxxxxxx"  # Modify with your VPC ID

# Download the recommended inline policy for the controller installation
echo "Downloading recommended inline policy..."
curl https://raw.githubusercontent.com/aws/aws-application-networking-k8s/main/files/controller-installation/recommended-inline-policy.json -o recommended-inline-policy.json

# Create the IAM policy
echo "Creating IAM policy..."
aws iam create-policy \
    --policy-name VPCLatticeControllerIAMPolicy-eks-2 \
    --policy-document file://recommended-inline-policy.json

# Get the Policy ARN
VPCLatticeControllerIAMPolicyArn=$(aws iam list-policies --query 'Policies[?PolicyName==`VPCLatticeControllerIAMPolicy-eks-2`].Arn' --output text)

# Apply the controller installation manifest for the namespace
echo "Applying the controller installation manifest..."
kubectl apply -f https://raw.githubusercontent.com/aws/aws-application-networking-k8s/main/files/controller-installation/deploy-namesystem.yaml

# Enable OIDC provider for EKS
echo "Enabling OIDC provider..."
eksctl utils associate-iam-oidc-provider --cluster $CLUSTER_NAME --approve --region $AWS_REGION

# Create IAM Service Account for the pod
echo "Creating IAM service account..."
eksctl create iamserviceaccount \
    --cluster=$CLUSTER_NAME \
    --namespace=aws-application-networking-system \
    --name=gateway-api-controller \
    --attach-policy-arn=$VPCLatticeControllerIAMPolicyArn \
    --override-existing-serviceaccounts \
    --region $AWS_REGION \
    --approve

# Login to AWS ECR Public and install the controller using Helm
echo "Login to AWS ECR Public and installing Helm chart..."
aws ecr-public get-login-password --region us-east-1 | helm registry login --username AWS --password-stdin public.ecr.aws

# Install the Gateway API Controller with Helm
helm install gateway-api-controller \
    oci://public.ecr.aws/aws-application-networking-k8s/aws-gateway-controller-chart \
    --version=v1.0.6 \
    --set=serviceAccount.create=false \
    --namespace aws-application-networking-system \
    --set=awsRegion=$AWS_REGION \
    --set=clusterVpcId=$VPC_ID \
    --set=awsAccountId=$AWS_ACCOUNT_ID \
    --set=clusterName=$CLUSTER_NAME \
    --set=log.level=info

# Wait for the Pod to be in RUNNING state
echo "Waiting for the gateway-api-controller pod to be in RUNNING state..."
kubectl wait --namespace aws-application-networking-system \
  --for=condition=ready pod -l app.kubernetes.io/instance=gateway-api-controller \
  --timeout=300s  # Timeout after 5 minutes

# Wait for 3 minutes after Helm installation
echo "Waiting for 3 minutes after Helm installation..."
sleep 180  # Sleep for 180 seconds (3 minutes)

# Apply the GatewayClass manifest
echo "Creating GatewayClass..."
kubectl apply -f https://raw.githubusercontent.com/aws/aws-application-networking-k8s/main/files/controller-installation/gatewayclass.yaml

# Check if the Git repository exists, if not, clone it
GIT_REPO_DIR="aws-application-networking-k8s"  # Directory to check for Git clone
if [ ! -d "$GIT_REPO_DIR" ]; then
  echo "Cloning AWS Gateway API Controller repository..."
  git clone https://github.com/aws/aws-application-networking-k8s.git
else
  echo "Repository already cloned. Skipping git clone."
fi

# Navigate to the repository directory
cd $GIT_REPO_DIR

# Update the Helm chart for default service network
echo "Updating Helm chart for service network..."
aws ecr-public get-login-password --region us-east-1 | helm registry login --username AWS --password-stdin public.ecr.aws

helm upgrade gateway-api-controller \
    oci://public.ecr.aws/aws-application-networking-k8s/aws-gateway-controller-chart \
    --version=v1.0.6 \
    --reuse-values \
    --namespace aws-application-networking-system \
    --set=defaultServiceNetwork=my-hotel

# Wait for 3 minutes after updating the Helm chart
echo "Waiting for 3 minutes after updating Helm chart..."
sleep 180  # Sleep for 180 seconds (3 minutes)

# Test deployment and east-west connectivity

# Apply the "my-hotel" gateway configuration
# (paths are relative to the repository root entered with "cd" above)
echo "Creating the 'my-hotel' gateway..."
kubectl apply -f files/examples/my-hotel-gateway.yaml

# Verify that the Gateway was created successfully
echo "Verifying Gateway creation..."
kubectl get gateway

# Apply the HTTPRoute configurations
echo "Applying HTTPRoute configurations..."
kubectl apply -f files/examples/parking.yaml
kubectl apply -f files/examples/review.yaml
kubectl apply -f files/examples/rate-route-path.yaml

# Apply additional HTTPRoute for inventory
echo "Applying inventory HTTPRoute..."
kubectl apply -f files/examples/inventory-ver1.yaml
kubectl apply -f files/examples/inventory-route.yaml

# Wait for 3 minutes after applying inventory HTTPRoute
echo "Waiting for 3 minutes after applying inventory HTTPRoute..."
sleep 180  # Sleep for 180 seconds (3 minutes)

# Check DNS names of HTTPRoutes
echo "Fetching DNS names for HTTPRoutes..."
ratesFQDN=$(kubectl get httproute rates -o json | jq -r '.metadata.annotations."application-networking.k8s.aws/lattice-assigned-domain-name"')
inventoryFQDN=$(kubectl get httproute inventory -o json | jq -r '.metadata.annotations."application-networking.k8s.aws/lattice-assigned-domain-name"')

# Display DNS names
echo -e "Rates FQDN: $ratesFQDN\nInventory FQDN: $inventoryFQDN"

# Verify service-to-service communication from inventory to parking and review services
echo "Verifying service-to-service communication..."
kubectl exec deploy/inventory-ver1 -- curl -s $ratesFQDN/parking $ratesFQDN/review
kubectl exec deploy/parking -- curl -s $inventoryFQDN
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
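&lt;p&gt;The script reads the assigned DNS name with &lt;code&gt;jq&lt;/code&gt;, which prints the literal string &lt;code&gt;null&lt;/code&gt; if the annotation has not been set yet. If you would rather check for that case explicitly, here is a small Python sketch of the same lookup. The function name &lt;code&gt;lattice_domain&lt;/code&gt; is hypothetical; the annotation key is the one used in the script above.&lt;/p&gt;

```python
import json
from typing import Optional

ANNOTATION = "application-networking.k8s.aws/lattice-assigned-domain-name"


def lattice_domain(httproute_json: str) -> Optional[str]:
    """Return the VPC Lattice domain name assigned to an HTTPRoute, if any."""
    route = json.loads(httproute_json)
    annotations = route.get("metadata", {}).get("annotations") or {}
    return annotations.get(ANNOTATION)
```

&lt;p&gt;Feed it the output of &lt;code&gt;kubectl get httproute rates -o json&lt;/code&gt;. A return value of &lt;code&gt;None&lt;/code&gt; means the controller has not finished registering the route, so the caller can retry instead of relying on a fixed three-minute sleep.&lt;/p&gt;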



</description>
      <category>lattice</category>
      <category>automation</category>
      <category>aws</category>
      <category>eks</category>
    </item>
  </channel>
</rss>
