<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Pavan Madduri</title>
    <description>The latest articles on Forem by Pavan Madduri (@pavan_madduri).</description>
    <link>https://forem.com/pavan_madduri</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2932890%2F3edefe3a-d10f-4ebb-a50a-6d20040e2812.png</url>
      <title>Forem: Pavan Madduri</title>
      <link>https://forem.com/pavan_madduri</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/pavan_madduri"/>
    <language>en</language>
    <item>
      <title>Deploying a Production-Ready K3s Cluster on OCI Always Free ARM Instances</title>
      <dc:creator>Pavan Madduri</dc:creator>
      <pubDate>Wed, 11 Mar 2026 14:50:00 +0000</pubDate>
      <link>https://forem.com/pavan_madduri/deploying-a-production-ready-k3s-cluster-on-oci-always-free-arm-instances-mmj</link>
      <guid>https://forem.com/pavan_madduri/deploying-a-production-ready-k3s-cluster-on-oci-always-free-arm-instances-mmj</guid>
      <description>&lt;h1&gt;
  
  
  Deploying a Production-Ready K3s Cluster on OCI Always Free ARM Instances
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;How I turned Oracle Cloud's free ARM compute into a fully functional Kubernetes cluster — with ingress, persistent storage, and TLS — all without spending a dollar.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;I have been running Kubernetes clusters professionally for years — managed services like EKS, AKS, GKE, and self-hosted clusters with kubeadm. They all cost money. The major clouds charge roughly $70-80/month for the control plane alone (EKS and GKE both bill about $0.10 per hour for it).&lt;/p&gt;

&lt;p&gt;Then I looked at what Oracle Cloud gives away for free: 4 ARM OCPUs and 24GB of RAM on the Always Free tier. That is more compute than most developers use for their entire home lab. The question was obvious — could I run a real Kubernetes cluster on it?&lt;/p&gt;

&lt;p&gt;The answer is yes, and it works better than I expected.&lt;/p&gt;

&lt;p&gt;In this post, I will walk through deploying K3s — Rancher's lightweight Kubernetes distribution — on OCI Always Free ARM instances. Not a toy cluster. A cluster with ingress routing, persistent volumes, automatic TLS certificates, and enough resources to run real workloads.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why K3s on OCI ARM?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Why K3s over full Kubernetes?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;K3s strips out the components most developers never use — in-tree cloud providers, in-tree storage drivers, and legacy or alpha features — and replaces etcd with SQLite (or embedded etcd for HA). The result is a single binary under 100MB that starts in seconds.&lt;/p&gt;

&lt;p&gt;On resource-constrained Always Free instances, this matters. Full kubeadm clusters consume 2-3GB of RAM just for the control plane. K3s uses around 512MB.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why OCI ARM over other clouds?&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Free Compute&lt;/th&gt;
&lt;th&gt;RAM&lt;/th&gt;
&lt;th&gt;Duration&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OCI Always Free&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4 ARM OCPUs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;24 GB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Forever&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS Free Tier&lt;/td&gt;
&lt;td&gt;1 vCPU (t2.micro)&lt;/td&gt;
&lt;td&gt;1 GB&lt;/td&gt;
&lt;td&gt;12 months&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GCP Free Tier&lt;/td&gt;
&lt;td&gt;0.25 vCPU (e2-micro)&lt;/td&gt;
&lt;td&gt;1 GB&lt;/td&gt;
&lt;td&gt;Forever&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure Free&lt;/td&gt;
&lt;td&gt;1 vCPU (B1S)&lt;/td&gt;
&lt;td&gt;1 GB&lt;/td&gt;
&lt;td&gt;12 months&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;There is no comparison. OCI gives you 24x the RAM of any competitor's free tier, permanently.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;Here is what we are building:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────────────────────────────────────┐
│                    OCI VCN (10.0.0.0/16)             │
│                                                      │
│  ┌────────────────────────────────────────────────┐  │
│  │           Public Subnet (10.0.1.0/24)          │  │
│  │                                                │  │
│  │  ┌──────────────────┐  ┌────────────────────┐  │  │
│  │  │   K3s Server     │  │   K3s Agent        │  │  │
│  │  │   (Control Plane)│  │   (Worker Node)    │  │  │
│  │  │                  │  │                    │  │  │
│  │  │  2 OCPU / 12GB   │  │  2 OCPU / 12GB     │  │  │
│  │  │  Oracle Linux 9  │  │  Oracle Linux 9    │  │  │
│  │  │                  │  │                    │  │  │
│  │  │  - K3s server    │  │  - K3s agent       │  │  │
│  │  │  - Traefik       │  │  - Workloads       │  │  │
│  │  │  - CoreDNS       │  │  - Pods            │  │  │
│  │  │  - Metrics       │  │                    │  │  │
│  │  └──────────────────┘  └────────────────────┘  │  │
│  │                                                │  │
│  └────────────────────────────────────────────────┘  │
│                                                      │
│  Security List:                                      │
│    Ingress: SSH(22), HTTP(80), HTTPS(443),           │
│             K8s API(6443), Kubelet(10250),           │
│             NodePort(30000-32767)                    │
│    Egress:  All traffic                              │
│                                                      │
│  ┌────────────────────────────────────────────────┐  │
│  │  OCI Load Balancer (10 Mbps - Always Free)     │  │
│  │  → Forwards 80/443 to K3s Traefik Ingress      │  │
│  └────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We split the 4 OCPUs and 24GB evenly: 2 OCPUs + 12GB for the server node, 2 OCPUs + 12GB for the worker. This gives the control plane enough room to breathe while leaving serious capacity for workloads.&lt;/p&gt;
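
&lt;p&gt;The split is worth double-checking before you provision, because OCI rejects A1 shapes that push the tenancy past its free allowance. A tiny sanity check (values hard-coded from the split above; this is my own sketch, not an OCI API call):&lt;br&gt;
&lt;/p&gt;

```shell
# Sanity-check the two node shapes against the Always Free A1 allowance
# (4 OCPUs / 24 GB total). Values mirror the split described above.
SERVER_OCPUS=2; SERVER_GB=12
AGENT_OCPUS=2;  AGENT_GB=12

TOTAL_OCPUS=$((SERVER_OCPUS + AGENT_OCPUS))
TOTAL_GB=$((SERVER_GB + AGENT_GB))

if [ "$TOTAL_OCPUS" -le 4 ] && [ "$TOTAL_GB" -le 24 ]; then
  echo "OK: ${TOTAL_OCPUS} OCPUs / ${TOTAL_GB} GB fits the free allowance"
else
  echo "Over the free allowance -- provisioning will fail"
fi
```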

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before starting, you need:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;OCI account with Always Free tier&lt;/strong&gt; — Sign up at &lt;a href="https://cloud.oracle.com" rel="noopener noreferrer"&gt;cloud.oracle.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OCI CLI configured&lt;/strong&gt; — Use Cloud Shell (pre-configured) or install locally&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Two A1.Flex instances provisioned&lt;/strong&gt; — Follow the VCN + compute setup from my earlier posts, but create two instances instead of one&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SSH access to both instances&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you do not have the instances yet, provision them with these shapes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Server node&lt;/span&gt;
&lt;span class="nv"&gt;SHAPE_CONFIG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{"ocpus":2,"memoryInGBs":12}'&lt;/span&gt;

&lt;span class="c"&gt;# Agent node (same config)&lt;/span&gt;
&lt;span class="nv"&gt;SHAPE_CONFIG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{"ocpus":2,"memoryInGBs":12}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both must use an &lt;code&gt;aarch64&lt;/code&gt; Oracle Linux 9 image — ARM architecture is critical here.&lt;/p&gt;
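
&lt;p&gt;If you script the provisioning, it is easy to paste an &lt;code&gt;x86_64&lt;/code&gt; image OCID by mistake and end up with an instance the ARM K3s binary will not run on. A small guard worth adding (the image display name below is a hypothetical example):&lt;br&gt;
&lt;/p&gt;

```shell
# Refuse to proceed unless the chosen image is an aarch64 build.
# The image display name below is a hypothetical example.
is_arm_image() {
  case "$1" in
    *aarch64*) return 0 ;;
    *)         return 1 ;;
  esac
}

IMAGE_NAME="Oracle-Linux-9-aarch64-2024.06.26-0"
if is_arm_image "$IMAGE_NAME"; then
  echo "OK: ARM image selected"
else
  echo "Not an aarch64 image -- pick an ARM build"
fi
```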

&lt;h2&gt;
  
  
  Step 1: Preparing the Instances
&lt;/h2&gt;

&lt;p&gt;SSH into both instances and run the same preparation steps. OCI's Oracle Linux 9 images have &lt;code&gt;firewalld&lt;/code&gt; and &lt;code&gt;iptables&lt;/code&gt; rules that interfere with Kubernetes networking. We need to handle this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# On BOTH nodes&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;dnf update &lt;span class="nt"&gt;-y&lt;/span&gt;

&lt;span class="c"&gt;# Disable firewalld — K3s manages its own iptables rules&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl stop firewalld
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl disable firewalld

&lt;span class="c"&gt;# Load required kernel modules&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt; | sudo tee /etc/modules-load.d/k3s.conf
br_netfilter
overlay
&lt;/span&gt;&lt;span class="no"&gt;EOF

&lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;modprobe br_netfilter
&lt;span class="nb"&gt;sudo &lt;/span&gt;modprobe overlay

&lt;span class="c"&gt;# Set required sysctl parameters&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt; | sudo tee /etc/sysctl.d/k3s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
&lt;/span&gt;&lt;span class="no"&gt;EOF

&lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;sysctl &lt;span class="nt"&gt;--system&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why these specific settings?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;br_netfilter&lt;/strong&gt; — Enables iptables to see bridged traffic (required for pod-to-pod communication across nodes)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;overlay&lt;/strong&gt; — Required by the container runtime for overlay filesystem&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ip_forward&lt;/strong&gt; — Allows the kernel to forward packets between network interfaces (essential for routing traffic to pods)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I spent two hours debugging connectivity issues on my first attempt because I forgot &lt;code&gt;br_netfilter&lt;/code&gt;. Pods on different nodes simply could not talk to each other. The symptom was DNS resolution failures — CoreDNS pods could not reach each other.&lt;/p&gt;
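
&lt;p&gt;To avoid repeating that two-hour debugging session, a quick pre-flight check before installing K3s is cheap insurance. A sketch to run on each node (the helper function is my own, not part of K3s):&lt;br&gt;
&lt;/p&gt;

```shell
# Report whether each required sysctl has the expected value, so a
# missing module shows up here instead of as a DNS failure later.
check_sysctl() {
  key=$1; want=$2
  actual=$(sysctl -n "$key" 2>/dev/null)
  if [ "$actual" = "$want" ]; then
    echo "PASS $key=$want"
  else
    echo "FAIL $key (got '${actual:-unset}', want $want)"
  fi
}

check_sysctl net.ipv4.ip_forward 1
check_sysctl net.bridge.bridge-nf-call-iptables 1
check_sysctl net.bridge.bridge-nf-call-ip6tables 1
lsmod 2>/dev/null | grep -q br_netfilter \
  && echo "PASS br_netfilter loaded" \
  || echo "FAIL br_netfilter not loaded"
```

Any FAIL line means pod networking will misbehave in exactly the way described above.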

&lt;h2&gt;
  
  
  Step 2: OCI Security List Configuration
&lt;/h2&gt;

&lt;p&gt;This is where most OCI + Kubernetes guides fall short. The default security list blocks inter-node communication that K3s needs.&lt;/p&gt;

&lt;p&gt;You need these ingress rules on the security list attached to your subnet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Update security list with K3s-required ports&lt;/span&gt;
oci network security-list update &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--security-list-id&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SL_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--ingress-security-rules&lt;/span&gt; &lt;span class="s1"&gt;'[
        {"source":"0.0.0.0/0","protocol":"6",
         "tcpOptions":{"destinationPortRange":{"min":22,"max":22}}},
        {"source":"0.0.0.0/0","protocol":"6",
         "tcpOptions":{"destinationPortRange":{"min":80,"max":80}}},
        {"source":"0.0.0.0/0","protocol":"6",
         "tcpOptions":{"destinationPortRange":{"min":443,"max":443}}},
        {"source":"10.0.0.0/16","protocol":"6",
         "tcpOptions":{"destinationPortRange":{"min":6443,"max":6443}}},
        {"source":"10.0.0.0/16","protocol":"6",
         "tcpOptions":{"destinationPortRange":{"min":10250,"max":10250}}},
        {"source":"10.0.0.0/16","protocol":"17",
         "udpOptions":{"destinationPortRange":{"min":8472,"max":8472}}},
        {"source":"10.0.0.0/16","protocol":"6",
         "tcpOptions":{"destinationPortRange":{"min":2379,"max":2380}}},
        {"source":"0.0.0.0/0","protocol":"6",
         "tcpOptions":{"destinationPortRange":{"min":30000,"max":32767}}}
    ]'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--egress-security-rules&lt;/span&gt; &lt;span class="s1"&gt;'[
        {"destination":"0.0.0.0/0","protocol":"all"}
    ]'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--force&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /dev/null
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Port breakdown:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Port&lt;/th&gt;
&lt;th&gt;Protocol&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;22&lt;/td&gt;
&lt;td&gt;TCP&lt;/td&gt;
&lt;td&gt;SSH access&lt;/td&gt;
&lt;td&gt;Anywhere&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;80&lt;/td&gt;
&lt;td&gt;TCP&lt;/td&gt;
&lt;td&gt;HTTP ingress&lt;/td&gt;
&lt;td&gt;Anywhere&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;443&lt;/td&gt;
&lt;td&gt;TCP&lt;/td&gt;
&lt;td&gt;HTTPS ingress&lt;/td&gt;
&lt;td&gt;Anywhere&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6443&lt;/td&gt;
&lt;td&gt;TCP&lt;/td&gt;
&lt;td&gt;K3s API server&lt;/td&gt;
&lt;td&gt;VCN only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10250&lt;/td&gt;
&lt;td&gt;TCP&lt;/td&gt;
&lt;td&gt;Kubelet metrics&lt;/td&gt;
&lt;td&gt;VCN only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8472&lt;/td&gt;
&lt;td&gt;UDP&lt;/td&gt;
&lt;td&gt;VXLAN (Flannel CNI)&lt;/td&gt;
&lt;td&gt;VCN only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2379-2380&lt;/td&gt;
&lt;td&gt;TCP&lt;/td&gt;
&lt;td&gt;etcd (if HA)&lt;/td&gt;
&lt;td&gt;VCN only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;30000-32767&lt;/td&gt;
&lt;td&gt;TCP&lt;/td&gt;
&lt;td&gt;NodePort services&lt;/td&gt;
&lt;td&gt;Anywhere&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Notice that internal K3s ports (6443, 10250, 8472) are restricted to the VCN CIDR &lt;code&gt;10.0.0.0/16&lt;/code&gt;. Never expose the Kubernetes API to the internet in production. If you want &lt;code&gt;kubectl&lt;/code&gt; access from your laptop, tunnel the API over SSH or add a narrow ingress rule for your own IP rather than opening 6443 to &lt;code&gt;0.0.0.0/0&lt;/code&gt;.&lt;/p&gt;
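
&lt;p&gt;Security rule lists tend to grow over time, so it helps to audit them mechanically. A rough sketch that scans rule JSON for internal K3s ports exposed to the internet — the &lt;code&gt;RULES&lt;/code&gt; data is a trimmed, hand-written illustration, but in practice you would feed it the output of &lt;code&gt;oci network security-list get&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

```shell
# Flag any security rule that opens an internal K3s port to 0.0.0.0/0.
# RULES is a trimmed, hand-written example of per-rule JSON lines.
RULES='
{"source":"0.0.0.0/0","min":443}
{"source":"10.0.0.0/16","min":6443}
{"source":"0.0.0.0/0","min":6443}
'

FLAGGED=0
for port in 6443 10250 8472 2379 2380; do
  if echo "$RULES" | grep '"source":"0.0.0.0/0"' | grep -q "\"min\":$port"; then
    echo "WARNING: port $port is open to the internet"
    FLAGGED=$((FLAGGED + 1))
  fi
done
echo "flagged $FLAGGED rule(s)"
```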

&lt;h2&gt;
  
  
  Step 3: Installing K3s Server
&lt;/h2&gt;

&lt;p&gt;SSH into your first instance (the server node) and install K3s:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# On the SERVER node&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;INSTALL_K3S_EXEC&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"server"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;K3S_NODE_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"k3s-server"&lt;/span&gt;

curl &lt;span class="nt"&gt;-sfL&lt;/span&gt; https://get.k3s.io | sh &lt;span class="nt"&gt;-s&lt;/span&gt; - &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--write-kubeconfig-mode&lt;/span&gt; 644 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--tls-san&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; http://169.254.169.254/opc/v1/instance/metadata/public_ip&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--node-external-ip&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; http://169.254.169.254/opc/v1/instance/metadata/public_ip&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--flannel-iface&lt;/span&gt; enp0s6 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--disable&lt;/span&gt; servicelb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let me explain each flag because they all matter on OCI:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;--write-kubeconfig-mode 644&lt;/code&gt;&lt;/strong&gt; — Makes the kubeconfig readable without sudo. Useful for development but tighten this in production&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;--tls-san &amp;lt;public_ip&amp;gt;&lt;/code&gt;&lt;/strong&gt; — Adds the public IP to the K3s API server's TLS certificate. Without this, &lt;code&gt;kubectl&lt;/code&gt; from your laptop will get TLS errors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;--node-external-ip &amp;lt;public_ip&amp;gt;&lt;/code&gt;&lt;/strong&gt; — Tells K3s about the node's public IP. OCI instances only see their private IP on the network interface&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;--flannel-iface enp0s6&lt;/code&gt;&lt;/strong&gt; — Forces Flannel to use the correct network interface. OCI ARM instances use &lt;code&gt;enp0s6&lt;/code&gt; as the primary interface, not &lt;code&gt;eth0&lt;/code&gt;. I discovered this the hard way — Flannel defaulted to the wrong interface and VXLAN tunnels failed silently&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;--disable servicelb&lt;/code&gt;&lt;/strong&gt; — Disables K3s's built-in load balancer (ServiceLB/Klipper). We will use OCI's Always Free Load Balancer instead&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The instance metadata endpoint &lt;code&gt;169.254.169.254&lt;/code&gt; is OCI's equivalent of AWS's metadata service. It returns instance details without needing the OCI CLI.&lt;/p&gt;
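
&lt;p&gt;One caveat: if the instance has no public IP attached, that metadata call returns an empty body and the install flags silently become malformed. A defensive wrapper is worth the extra lines (helper names are my own, not part of K3s or OCI):&lt;br&gt;
&lt;/p&gt;

```shell
# Fetch the public IP from instance metadata, but fail loudly instead of
# passing an empty string into the K3s install flags.
get_public_ip() {
  # --max-time stops the installer hanging if the metadata service is slow
  curl -s --max-time 5 http://169.254.169.254/opc/v1/instance/metadata/public_ip
}

require_ip() {
  case "$1" in
    [0-9]*.[0-9]*.[0-9]*.[0-9]*) return 0 ;;
    *) echo "no public IP returned by metadata -- aborting" ; return 1 ;;
  esac
}
```

&lt;p&gt;Then run &lt;code&gt;PUB_IP=$(get_public_ip); require_ip "$PUB_IP"&lt;/code&gt; before substituting the value into the install command.&lt;/p&gt;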

&lt;p&gt;Verify the server is running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl status k3s

&lt;span class="c"&gt;# Check node status&lt;/span&gt;
kubectl get nodes
&lt;span class="c"&gt;# NAME         STATUS   ROLES                  AGE   VERSION&lt;/span&gt;
&lt;span class="c"&gt;# k3s-server   Ready    control-plane,master   45s   v1.31.4+k3s1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Grab the join token — the agent node needs this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo cat&lt;/span&gt; /var/lib/rancher/k3s/server/node-token
&lt;span class="c"&gt;# K10xxxx::server:yyyy&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 4: Joining the Agent Node
&lt;/h2&gt;

&lt;p&gt;SSH into your second instance and install K3s in agent mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# On the AGENT node&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;K3S_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"https://&amp;lt;SERVER_PRIVATE_IP&amp;gt;:6443"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;K3S_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;TOKEN_FROM_STEP_3&amp;gt;"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;K3S_NODE_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"k3s-agent"&lt;/span&gt;

curl &lt;span class="nt"&gt;-sfL&lt;/span&gt; https://get.k3s.io | sh &lt;span class="nt"&gt;-s&lt;/span&gt; - &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--node-external-ip&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; http://169.254.169.254/opc/v1/instance/metadata/public_ip&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--flannel-iface&lt;/span&gt; enp0s6
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Important: use the &lt;strong&gt;private IP&lt;/strong&gt; of the server node for &lt;code&gt;K3S_URL&lt;/code&gt;, not the public IP. Both instances are in the same VCN subnet, so they communicate over the private network. This is faster, free (no egress charges), and more secure.&lt;/p&gt;
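
&lt;p&gt;Joins can take a minute or two while the agent downloads the K3s binary, so rather than eyeballing repeated &lt;code&gt;kubectl get nodes&lt;/code&gt; calls, a tiny retry helper keeps any automation honest. This is my own sketch, not a K3s feature:&lt;br&gt;
&lt;/p&gt;

```shell
# Retry a command until it succeeds or attempts run out, e.g.:
#   retry 30 kubectl get node k3s-agent
# RETRY_INTERVAL (seconds between attempts) defaults to 5.
retry() {
  attempts=$1; shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "${RETRY_INTERVAL:-5}"
  done
  return 1
}
```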

&lt;p&gt;Back on the server node, verify both nodes are ready:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get nodes &lt;span class="nt"&gt;-o&lt;/span&gt; wide
&lt;span class="c"&gt;# NAME         STATUS   ROLES                  AGE     VERSION        INTERNAL-IP   EXTERNAL-IP&lt;/span&gt;
&lt;span class="c"&gt;# k3s-server   Ready    control-plane,master   5m      v1.31.4+k3s1   10.0.1.10     &amp;lt;public&amp;gt;&lt;/span&gt;
&lt;span class="c"&gt;# k3s-agent    Ready    &amp;lt;none&amp;gt;                 30s     v1.31.4+k3s1   10.0.1.11     &amp;lt;public&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two nodes. 4 OCPUs. 24GB RAM. Zero dollars.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Configuring the OCI Load Balancer
&lt;/h2&gt;

&lt;p&gt;OCI's Always Free tier includes a 10 Mbps Flexible Load Balancer. We will point it at our K3s nodes to route HTTP/HTTPS traffic to the Traefik ingress controller.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create the load balancer&lt;/span&gt;
&lt;span class="nv"&gt;LB_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;oci lb load-balancer create &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--compartment-id&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$COMPARTMENT_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--display-name&lt;/span&gt; &lt;span class="s2"&gt;"k3s-ingress-lb"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--shape-name&lt;/span&gt; &lt;span class="s2"&gt;"flexible"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--shape-details&lt;/span&gt; &lt;span class="s1"&gt;'{"minimumBandwidthInMbps":10,"maximumBandwidthInMbps":10}'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--subnet-ids&lt;/span&gt; &lt;span class="s2"&gt;"[&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="nv"&gt;$SUBNET_ID&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;]"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--is-private&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s1"&gt;'data.id'&lt;/span&gt; &lt;span class="nt"&gt;--raw-output&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--wait-for-state&lt;/span&gt; SUCCEEDED&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Create a backend set with health check&lt;/span&gt;
oci lb backend-set create &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--load-balancer-id&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LB_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"k3s-backends"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--policy&lt;/span&gt; &lt;span class="s2"&gt;"ROUND_ROBIN"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--health-checker-protocol&lt;/span&gt; &lt;span class="s2"&gt;"TCP"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--health-checker-port&lt;/span&gt; 80 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--health-checker-interval-in-ms&lt;/span&gt; 10000 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--health-checker-timeout-in-ms&lt;/span&gt; 3000 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--health-checker-retries&lt;/span&gt; 3 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--wait-for-state&lt;/span&gt; SUCCEEDED

&lt;span class="c"&gt;# Add both nodes as backends&lt;/span&gt;
oci lb backend create &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--load-balancer-id&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LB_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--backend-set-name&lt;/span&gt; &lt;span class="s2"&gt;"k3s-backends"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--ip-address&lt;/span&gt; &lt;span class="s2"&gt;"&amp;lt;SERVER_PRIVATE_IP&amp;gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--port&lt;/span&gt; 80 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--wait-for-state&lt;/span&gt; SUCCEEDED

oci lb backend create &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--load-balancer-id&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LB_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--backend-set-name&lt;/span&gt; &lt;span class="s2"&gt;"k3s-backends"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--ip-address&lt;/span&gt; &lt;span class="s2"&gt;"&amp;lt;AGENT_PRIVATE_IP&amp;gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--port&lt;/span&gt; 80 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--wait-for-state&lt;/span&gt; SUCCEEDED

&lt;span class="c"&gt;# Create HTTP listener&lt;/span&gt;
oci lb listener create &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--load-balancer-id&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LB_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"http-listener"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--default-backend-set-name&lt;/span&gt; &lt;span class="s2"&gt;"k3s-backends"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--protocol&lt;/span&gt; &lt;span class="s2"&gt;"HTTP"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--port&lt;/span&gt; 80 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--wait-for-state&lt;/span&gt; SUCCEEDED
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The 10 Mbps shape is Always Free. It is enough for development, personal projects, and moderate traffic. The load balancer gets its own public IP, which becomes your cluster's entry point.&lt;/p&gt;

&lt;p&gt;Get the load balancer IP:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;LB_IP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;oci lb load-balancer get &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--load-balancer-id&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LB_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s1"&gt;'data."ip-addresses"[0]."ip-address"'&lt;/span&gt; &lt;span class="nt"&gt;--raw-output&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Load Balancer IP: &lt;/span&gt;&lt;span class="nv"&gt;$LB_IP&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 6: Deploying a Test Workload
&lt;/h2&gt;

&lt;p&gt;Let us deploy something real to verify the entire pipeline works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# nginx-demo.yaml&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx-demo&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx-demo&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx-demo&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx-demo&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx:alpine&lt;/span&gt;
        &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
        &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;50m&lt;/span&gt;
            &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;64Mi&lt;/span&gt;
          &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;100m&lt;/span&gt;
            &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;128Mi&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Service&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx-demo&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx-demo&lt;/span&gt;
  &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
    &lt;span class="na"&gt;targetPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;networking.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Ingress&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx-demo&lt;/span&gt;
  &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;traefik.ingress.kubernetes.io/router.entrypoints&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;http&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/&lt;/span&gt;
        &lt;span class="na"&gt;pathType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Prefix&lt;/span&gt;
        &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx-demo&lt;/span&gt;
            &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;number&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; nginx-demo.yaml

&lt;span class="c"&gt;# Watch the pods come up&lt;/span&gt;
kubectl get pods &lt;span class="nt"&gt;-w&lt;/span&gt;
&lt;span class="c"&gt;# NAME                          READY   STATUS    RESTARTS   AGE&lt;/span&gt;
&lt;span class="c"&gt;# nginx-demo-6d9f7c8b4-abc12   1/1     Running   0          10s&lt;/span&gt;
&lt;span class="c"&gt;# nginx-demo-6d9f7c8b4-def34   1/1     Running   0          10s&lt;/span&gt;
&lt;span class="c"&gt;# nginx-demo-6d9f7c8b4-ghi56   1/1     Running   0          10s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three replicas spread across both nodes. Test it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://&lt;span class="nv"&gt;$LB_IP&lt;/span&gt;
&lt;span class="c"&gt;# &amp;lt;!DOCTYPE html&amp;gt;&lt;/span&gt;
&lt;span class="c"&gt;# &amp;lt;html&amp;gt;&lt;/span&gt;
&lt;span class="c"&gt;# &amp;lt;head&amp;gt;&amp;lt;title&amp;gt;Welcome to nginx!&amp;lt;/title&amp;gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Traffic flows: Internet → OCI Load Balancer → Traefik Ingress → nginx pods. All on free infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 7: Persistent Storage with OCI Block Volumes
&lt;/h2&gt;

&lt;p&gt;K3s includes the &lt;code&gt;local-path&lt;/code&gt; storage provisioner by default, which creates volumes on the node's local disk. For Always Free instances this works well, since the tier includes 200GB of block storage shared across the nodes' boot volumes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Verify the storage class exists&lt;/span&gt;
kubectl get storageclass
&lt;span class="c"&gt;# NAME                   PROVISIONER             RECLAIMPOLICY   AGE&lt;/span&gt;
&lt;span class="c"&gt;# local-path (default)   rancher.io/local-path   Delete          10m&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Test it with a PVC:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# pvc-test.yaml&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PersistentVolumeClaim&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;test-pvc&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;accessModes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ReadWriteOnce&lt;/span&gt;
  &lt;span class="na"&gt;storageClassName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;local-path&lt;/span&gt;
  &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;storage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;1Gi&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pvc-test&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;busybox&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;busybox&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sh"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-c"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;echo&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;'Persistent&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;storage&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;works&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;on&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;OCI&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;ARM'&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;/data/test.txt&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;cat&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;/data/test.txt&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;sleep&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;3600"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;volumeMounts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;data&lt;/span&gt;
      &lt;span class="na"&gt;mountPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/data&lt;/span&gt;
  &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;data&lt;/span&gt;
    &lt;span class="na"&gt;persistentVolumeClaim&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;claimName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;test-pvc&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; pvc-test.yaml
kubectl logs pvc-test
&lt;span class="c"&gt;# Persistent storage works on OCI ARM&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For production workloads that need data to survive node replacement, consider the OCI CSI driver — but for Always Free instances, local-path is practical and simple.&lt;/p&gt;
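&lt;p&gt;If you do go the CSI route, the StorageClass looks roughly like this. This is a sketch, not part of the original setup: the provisioner name is the one Oracle's Block Volume CSI driver registers, and the driver itself must be installed separately since K3s does not bundle it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# oci-bv.yaml (illustrative; assumes the OCI Block Volume CSI driver is installed)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: oci-bv
provisioner: blockvolume.csi.oraclecloud.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;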

&lt;h2&gt;
  
  
  Cluster Resource Usage
&lt;/h2&gt;

&lt;p&gt;After deploying K3s with Traefik, CoreDNS, and the test workload, here is what the resource consumption looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl top nodes
&lt;span class="c"&gt;# NAME         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%&lt;/span&gt;
&lt;span class="c"&gt;# k3s-server   180m         9%     1.2Gi           10%&lt;/span&gt;
&lt;span class="c"&gt;# k3s-agent    95m          5%     780Mi           6%&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The entire Kubernetes infrastructure — control plane, networking, DNS, ingress, and three nginx replicas — uses about 2GB of the available 24GB. That leaves &lt;strong&gt;22GB free for your actual workloads&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For context, here is what fits comfortably:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Workload&lt;/th&gt;
&lt;th&gt;CPU&lt;/th&gt;
&lt;th&gt;Memory&lt;/th&gt;
&lt;th&gt;Fits?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;PostgreSQL&lt;/td&gt;
&lt;td&gt;200m&lt;/td&gt;
&lt;td&gt;512Mi&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redis&lt;/td&gt;
&lt;td&gt;100m&lt;/td&gt;
&lt;td&gt;256Mi&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Go API server&lt;/td&gt;
&lt;td&gt;100m&lt;/td&gt;
&lt;td&gt;128Mi&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Python Flask app&lt;/td&gt;
&lt;td&gt;200m&lt;/td&gt;
&lt;td&gt;256Mi&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Grafana&lt;/td&gt;
&lt;td&gt;100m&lt;/td&gt;
&lt;td&gt;256Mi&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prometheus&lt;/td&gt;
&lt;td&gt;200m&lt;/td&gt;
&lt;td&gt;512Mi&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;900m&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.9Gi&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Easily&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;You could run a complete application stack — database, cache, API, monitoring — with room to spare.&lt;/p&gt;
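&lt;p&gt;If you want the cluster itself to enforce that budget, a ResourceQuota on the namespace caps the stack near the table's totals. The numbers below are illustrative, sized with headroom over the table above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# quota.yaml (illustrative sizing, based on the table above)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: app-stack-quota
  namespace: default
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 2Gi
    limits.cpu: "2"
    limits.memory: 4Gi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;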

&lt;h2&gt;
  
  
  Troubleshooting Common OCI + K3s Issues
&lt;/h2&gt;

&lt;p&gt;I hit every one of these during my setup. Saving you the debugging time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Pods stuck in ContainerCreating&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Usually a Flannel networking issue. Check if VXLAN traffic (UDP 8472) is allowed in the security list and verify Flannel is using the correct interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; k3s &lt;span class="nt"&gt;-f&lt;/span&gt; | &lt;span class="nb"&gt;grep &lt;/span&gt;flannel
&lt;span class="c"&gt;# If you see "failed to find interface" — fix the --flannel-iface flag&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Agent node shows NotReady&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The agent cannot reach the server on port 6443. Verify the security list allows TCP 6443 from the VCN CIDR and that you used the private IP in K3S_URL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# From the agent node&lt;/span&gt;
curl &lt;span class="nt"&gt;-k&lt;/span&gt; https://&amp;lt;SERVER_PRIVATE_IP&amp;gt;:6443
&lt;span class="c"&gt;# Should return JSON (even if it says Unauthorized)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. Ingress returns 404 for all routes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Traefik is running but not seeing your Ingress resources. Check Traefik logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl logs &lt;span class="nt"&gt;-n&lt;/span&gt; kube-system &lt;span class="nt"&gt;-l&lt;/span&gt; app.kubernetes.io/name&lt;span class="o"&gt;=&lt;/span&gt;traefik
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
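&lt;p&gt;If the logs show Traefik skipping your resource, one cause worth ruling out (a common one, though not the only possibility) is a missing &lt;code&gt;ingressClassName&lt;/code&gt;. Pinning it explicitly removes the ambiguity; &lt;code&gt;traefik&lt;/code&gt; is the class K3s' bundled Traefik registers, which you can confirm with &lt;code&gt;kubectl get ingressclass&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Excerpt: add under the Ingress spec
spec:
  ingressClassName: traefik
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;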



&lt;p&gt;&lt;strong&gt;4. OCI Load Balancer shows backends as Critical&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Health check is failing. Verify that Traefik is listening on port 80 on both nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ss &lt;span class="nt"&gt;-tlnp&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; :80
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;5. Cannot pull container images&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;OCI instances need outbound internet access through a NAT gateway or Internet Gateway. Verify your route table has a default route to the Internet Gateway.&lt;/p&gt;

&lt;h2&gt;
  
  
  Security Hardening
&lt;/h2&gt;

&lt;p&gt;For a cluster exposed to the internet, apply these minimum security measures:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Restrict API server access to your IP&lt;/span&gt;
&lt;span class="c"&gt;# Update security list: change 6443 source from VCN to your specific IP&lt;/span&gt;

&lt;span class="c"&gt;# 2. Create a non-root kubeconfig&lt;/span&gt;
kubectl create serviceaccount deploy-sa
kubectl create clusterrolebinding deploy-sa-binding &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--clusterrole&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;edit &lt;span class="nt"&gt;--serviceaccount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;default:deploy-sa

&lt;span class="c"&gt;# 3. Enable Network Policies&lt;/span&gt;
kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; - &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
  namespace: default
spec:
  podSelector: {}
  policyTypes:
  - Ingress
&lt;/span&gt;&lt;span class="no"&gt;EOF

&lt;/span&gt;&lt;span class="c"&gt;# 4. Set resource limits on all deployments (prevent noisy neighbors)&lt;/span&gt;
&lt;span class="c"&gt;# 5. Use OCI Vault for Kubernetes secrets (covered in my earlier post)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
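&lt;p&gt;One caveat: the deny-all policy above also blocks Traefik from reaching the demo pods, so pair it with an allow rule. The label selectors below assume K3s defaults; verify them with &lt;code&gt;kubectl get pods -n kube-system --show-labels&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# allow-traefik.yaml (selectors assume K3s' bundled Traefik in kube-system)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-traefik
  namespace: default
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          app.kubernetes.io/name: traefik
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;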



&lt;h2&gt;
  
  
  Accessing kubectl from Your Laptop
&lt;/h2&gt;

&lt;p&gt;Copy the kubeconfig from the server node to your local machine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# From your laptop&lt;/span&gt;
scp opc@&amp;lt;SERVER_PUBLIC_IP&amp;gt;:/etc/rancher/k3s/k3s.yaml ~/.kube/oci-k3s-config

&lt;span class="c"&gt;# Update the server address from 127.0.0.1 to the public IP&lt;/span&gt;
&lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt; &lt;span class="s2"&gt;"s/127.0.0.1/&amp;lt;SERVER_PUBLIC_IP&amp;gt;/g"&lt;/span&gt; ~/.kube/oci-k3s-config

&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;KUBECONFIG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;~/.kube/oci-k3s-config
kubectl get nodes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works because we added &lt;code&gt;--tls-san&lt;/code&gt; with the public IP during installation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cost Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Setup&lt;/th&gt;
&lt;th&gt;Monthly Cost&lt;/th&gt;
&lt;th&gt;Nodes&lt;/th&gt;
&lt;th&gt;RAM&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OCI Always Free + K3s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;24 GB&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EKS (t3.medium x2)&lt;/td&gt;
&lt;td&gt;~$150&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;8 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GKE Autopilot (equivalent)&lt;/td&gt;
&lt;td&gt;~$120&lt;/td&gt;
&lt;td&gt;Auto&lt;/td&gt;
&lt;td&gt;Auto&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AKS (B2s x2)&lt;/td&gt;
&lt;td&gt;~$65&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;8 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DigitalOcean K8s&lt;/td&gt;
&lt;td&gt;~$48&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;4 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Civo K3s&lt;/td&gt;
&lt;td&gt;~$40&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;4 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;OCI gives you 3x the RAM of paid alternatives, for free. The trade-off is that you manage K3s yourself — no managed control plane. For learning, development, and personal projects, that trade-off is excellent.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Can You Run on This Cluster?
&lt;/h2&gt;

&lt;p&gt;This is not theoretical. Here are workloads I have tested on this exact setup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Gitea&lt;/strong&gt; (self-hosted Git) — 128Mi RAM, works perfectly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Drone CI&lt;/strong&gt; (CI/CD) — 256Mi RAM, builds containers on ARM&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL&lt;/strong&gt; — 512Mi RAM, handles small-to-medium databases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Grafana + Prometheus&lt;/strong&gt; — 768Mi combined, full monitoring stack&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Go/Rust microservices&lt;/strong&gt; — Under 64Mi each, ARM-native builds are fast&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Static sites with Hugo&lt;/strong&gt; — Trivial resources, served through Traefik&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Oracle Cloud's Always Free ARM allocation is the best-kept secret in cloud computing for Kubernetes enthusiasts. 4 OCPUs, 24GB RAM, 200GB storage, a load balancer, and 10TB of outbound transfer — all free, permanently.&lt;/p&gt;

&lt;p&gt;K3s is the perfect match for this hardware. It is lightweight, ARM-native, and production-tested. The combination gives you a Kubernetes cluster that would cost $100-150/month on any other provider.&lt;/p&gt;

&lt;p&gt;The setup takes about 30 minutes from scratch, and the result is a cluster you can use for learning, development, CI/CD, or running personal projects. I have had mine running for weeks with zero issues.&lt;/p&gt;

&lt;p&gt;Stop paying for Kubernetes clusters you use for development. OCI and K3s give you a better option.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;All resources in this post use OCI Always Free tier. No charges will be incurred.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; &lt;code&gt;#OracleCloud&lt;/code&gt; &lt;code&gt;#Kubernetes&lt;/code&gt; &lt;code&gt;#K3s&lt;/code&gt; &lt;code&gt;#ARM&lt;/code&gt; &lt;code&gt;#OCI&lt;/code&gt; &lt;code&gt;#AlwaysFree&lt;/code&gt; &lt;code&gt;#CloudNative&lt;/code&gt; &lt;code&gt;#DevOps&lt;/code&gt; &lt;code&gt;#Containers&lt;/code&gt;&lt;/p&gt;

</description>
      <category>oracle</category>
      <category>k3s</category>
      <category>kubernetes</category>
      <category>cloudnative</category>
    </item>
    <item>
      <title>Why Your Kubernetes Cluster Breaks 18 Minutes After a Successful Deployment</title>
      <dc:creator>Pavan Madduri</dc:creator>
      <pubDate>Sat, 07 Mar 2026 17:29:46 +0000</pubDate>
      <link>https://forem.com/pavan_madduri/why-your-kubernetes-cluster-breaks-18-minutes-after-a-successful-deployment-229p</link>
      <guid>https://forem.com/pavan_madduri/why-your-kubernetes-cluster-breaks-18-minutes-after-a-successful-deployment-229p</guid>
      <description>&lt;p&gt;You merge the Pull Request. The CI/CD pipeline flashes green. ArgoCD reports that your application is "Synced" and "Healthy." You grab a coffee, thinking the deployment was a complete success.&lt;/p&gt;

&lt;p&gt;Then, 18 minutes later, your pager goes off. The cluster is degraded, and users are experiencing errors. What just happened?&lt;/p&gt;

&lt;h2&gt;
  
  
  The Delay of Reactive Monitoring
&lt;/h2&gt;

&lt;p&gt;This scenario is incredibly common in large-scale Kubernetes environments. The problem lies in how GitOps tools handle configuration drift. Tools like ArgoCD use continuous reconciliation loops, constantly comparing your Git manifests against the live cluster resources.&lt;/p&gt;

&lt;p&gt;However, this is a reactive approach. It only discovers problems post-deployment. According to comprehensive production benchmarks (Madduri, 2024), traditional monitoring detects drift an average of 18 minutes after problematic deployments complete.&lt;/p&gt;

&lt;p&gt;For 18 minutes, your system might have been starved of resources, stuck in a circular dependency, or suffering from a security policy breach. In a mission-critical platform, an 18-minute delay means dropped transactions and unhappy users.&lt;br&gt;
(To see the exact performance metrics comparing reactive vs. proactive monitoring, review the full study here: [&lt;a href="https://scholar.google.com/citations?view_op=view_citation&amp;amp;hl=en&amp;amp;user=au0O-8oAAAAJ&amp;amp;citation_for_view=au0O-8oAAAAJ:roLk4NBRz8UC" rel="noopener noreferrer"&gt;Google Scholar&lt;/a&gt;])&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing the 18-Minute Gap
&lt;/h2&gt;

&lt;p&gt;To fix this, we have to stop relying on monitoring tools to catch our mistakes. We need to verify our manifests mathematically during the continuous integration phase, before the deployment ever reaches the cluster.&lt;/p&gt;

&lt;p&gt;By using formal verification, we can construct state transition models to explore every possible failure mode of a manifest. When this proactive approach was tested across 850 production applications, it reduced the mean time to detect drift from 18 minutes down to under 30 seconds. It represents a 36x improvement in detection speed, entirely eliminating the dangerous 18-minute window.&lt;/p&gt;

&lt;p&gt;Stop waiting for your monitoring tools to tell you that your deployment failed. Prove that it will succeed before you ever click merge.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading &amp;amp; Formal Citation:
&lt;/h2&gt;

&lt;p&gt;The metrics and architectural solutions discussed in this article are drawn from my formal academic research on GitOps stability. If you are building internal developer platforms or CI/CD pipelines, you can cite the original research here:&lt;br&gt;
Madduri, Pavan. "GitOps &amp;amp; Stability: Formal Verification of ArgoCD Manifests - Preventing Deployment Drift in Mission-Critical Platforms." Power System Protection and Control 52, no. 3 (2024): 13-21.&lt;br&gt;
[&lt;a href="https://scholar.google.com/citations?view_op=view_citation&amp;amp;hl=en&amp;amp;user=au0O-8oAAAAJ&amp;amp;citation_for_view=au0O-8oAAAAJ:roLk4NBRz8UC" rel="noopener noreferrer"&gt;Google Scholar&lt;/a&gt;] | [&lt;a href="https://www.researchgate.net/publication/401271158_GITOPS_STABILITY_FORMAL_VERIFICATION_OF_ARGOCD_MANIFESTS_-PREVENTING_DEPLOYMENT_DRIFT_IN_MISSION-CRITICAL_PLATFORMS" rel="noopener noreferrer"&gt;ResearchGate&lt;/a&gt;]&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>gitops</category>
      <category>argocd</category>
      <category>devops</category>
    </item>
    <item>
      <title>Alert Fatigue is Breaking DevOps: Here is the Math</title>
      <dc:creator>Pavan Madduri</dc:creator>
      <pubDate>Mon, 02 Mar 2026 18:24:09 +0000</pubDate>
      <link>https://forem.com/pavan_madduri/alert-fatigue-is-breaking-devops-here-is-the-math-24eg</link>
      <guid>https://forem.com/pavan_madduri/alert-fatigue-is-breaking-devops-here-is-the-math-24eg</guid>
      <description>&lt;p&gt;"The Boy Who Cried Wolf" is the oldest story about monitoring systems ever written. If the alarm goes off every five minutes for a minor issue, eventually, the villagers stop waking up. In the tech industry, we call this &lt;strong&gt;Alert Fatigue&lt;/strong&gt;, and it is quietly destroying DevOps teams from the inside out.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Math Behind the Noise
&lt;/h2&gt;

&lt;p&gt;Let’s look at a standard microservices architecture. You might have 50 services, each reporting on CPU, memory, error rates, and latency. That is 200 potential thresholds.&lt;/p&gt;

&lt;p&gt;If you configure your alerts to trigger a Slack notification whenever CPU hits 80%, you are going to get spammed. Why? Because CPU spiking to 80% during a garbage-collection cycle is normal behavior for many Java applications.&lt;/p&gt;

&lt;p&gt;A mid-sized enterprise system easily generates &lt;strong&gt;thousands of alerts per day&lt;/strong&gt;. The human brain is simply not equipped to process a feed of 2,000 notifications and accurately spot the one critical database deadlock hidden in the noise.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cost of Context Switching
&lt;/h2&gt;

&lt;p&gt;The real danger of alert fatigue isn't just missing a critical outage (though that happens frequently). The real danger is the cognitive load on the engineer.&lt;/p&gt;

&lt;p&gt;Every time a Slack notification pings or a pager goes off, a developer’s context is broken. Studies show it takes roughly 23 minutes to get back into a state of deep focus after an interruption. If an on-call engineer receives just three non-critical alerts in an afternoon, their entire day of productive coding is effectively gone.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Death of the "Static Threshold"
&lt;/h2&gt;

&lt;p&gt;The reason we suffer from alert fatigue is that we rely on static, dumb thresholds. We tell our systems: "If X &amp;gt; 80, send an email."&lt;/p&gt;

&lt;p&gt;To fix this, the industry is moving toward contextual, AI-driven monitoring. Instead of sending an alert when a single metric spikes, modern systems use machine learning to look at the entire environment. They ask: "CPU is at 80%, but are users actually experiencing errors? If not, suppress the alert."&lt;/p&gt;

&lt;p&gt;We need to stop sending humans raw data, and start sending them actual, actionable context.&lt;/p&gt;
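&lt;p&gt;To make "actionable context" concrete, here is what a contextual alert can look like as a Prometheus rule. The metric names are assumptions for illustration; the point is the &lt;code&gt;and&lt;/code&gt; join, which only fires when high CPU coincides with user-facing errors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Illustrative Prometheus alerting rule (metric names are hypothetical)
groups:
- name: contextual-alerts
  rules:
  - alert: HighCpuWithUserErrors
    expr: |
      avg by (service) (cpu_usage_ratio) &amp;gt; 0.8
      and on (service)
      rate(http_request_errors_total[5m]) &amp;gt; 0
    for: 5m
    labels:
      severity: page
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;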

&lt;p&gt;&lt;strong&gt;Cite this research:&lt;/strong&gt;&lt;br&gt;
I recently conducted a study across three production environments supporting 2.8 million users, demonstrating how replacing static alerts with autonomous AI agents reduced Mean Time to Detection (MTTD) to under 60 seconds. Cite the formal research here:&lt;br&gt;
Madduri, P. (2026). "Agentic SRE Teams: Human-Agent Collaboration - A New Operational Model for Autonomous Incident Response." Power System Protection and Control, 54(1).&lt;br&gt;
[&lt;a href="https://scholar.google.com/citations?view_op=view_citation&amp;amp;hl=en&amp;amp;user=au0O-8oAAAAJ&amp;amp;citation_for_view=au0O-8oAAAAJ:UeHWp8X0CEIC" rel="noopener noreferrer"&gt;Link to Google Scholar&lt;/a&gt;] | [&lt;a href="https://www.researchgate.net/publication/401333715_AGENTIC_SRE_TEAMS_HUMAN-AGENT_COLLABORATION_-A_NEW_OPERATIONAL_MODEL_FOR_AUTONOMOUS_INCIDENT_RESPONSE" rel="noopener noreferrer"&gt;Link to ResearchGate PDF&lt;/a&gt;]&lt;/p&gt;

</description>
      <category>devops</category>
      <category>sre</category>
      <category>mentalhealth</category>
      <category>observability</category>
    </item>
    <item>
      <title>What is an AI Agent? (And Why SREs Need Them)</title>
      <dc:creator>Pavan Madduri</dc:creator>
      <pubDate>Mon, 02 Mar 2026 18:13:31 +0000</pubDate>
      <link>https://forem.com/pavan_madduri/what-is-an-ai-agent-and-why-sres-need-them-3ec2</link>
      <guid>https://forem.com/pavan_madduri/what-is-an-ai-agent-and-why-sres-need-them-3ec2</guid>
      <description>&lt;p&gt;If you spend any time on Tech Twitter or LinkedIn, you are probably drowning in the phrase "AI Agents." But if you strip away the marketing hype, what actually is an AI agent, and how is it different from just asking ChatGPT a question?&lt;/p&gt;

&lt;p&gt;If you work in Site Reliability Engineering (SRE) or platform engineering, understanding this difference is going to define the next five years of your career.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chatbots vs. Agents: The "Agency" Difference&lt;/strong&gt;&lt;br&gt;
A standard Large Language Model (LLM) like ChatGPT is a &lt;strong&gt;generator&lt;/strong&gt;. You give it a prompt, and it generates text. It is entirely passive. It doesn't know what time it is, it can't check your database, and it certainly can't restart a crashed Kubernetes pod.&lt;/p&gt;

&lt;p&gt;An &lt;strong&gt;AI Agent&lt;/strong&gt;, on the other hand, has agency.&lt;/p&gt;

&lt;p&gt;An agent is an LLM wrapped in a framework that allows it to interact with the outside world. It operates on a continuous loop:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Observe&lt;/strong&gt;: It pulls real-time data from its environment (e.g., reading a Datadog alert or a Prometheus metric).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reason&lt;/strong&gt;: It uses the LLM "brain" to analyze that data and decide what to do next.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Act&lt;/strong&gt;: It uses "Tools" (APIs, scripts, CLI commands) to take a real-world action (e.g., querying a database to see if a table is locked).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Why Do SREs Need Them?&lt;/strong&gt;&lt;br&gt;
Imagine it is 3:00 AM and you get a PagerDuty alert: CPU Spike on Payment Service.&lt;/p&gt;

&lt;p&gt;Without an agent, you drag yourself out of bed, open four different dashboards, write three different log queries, and spend 20 minutes just trying to figure out what is broken before you even try to fix it.&lt;/p&gt;

&lt;p&gt;An AI agent acts as your junior SRE. By the time you open your laptop, the agent has already:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Acknowledged the alert.&lt;/li&gt;
&lt;li&gt;Queried the logs for the last 10 minutes.&lt;/li&gt;
&lt;li&gt;Checked the recent Git commits to see who deployed code last.&lt;/li&gt;
&lt;li&gt;Summarized all of this into a neat, three-bullet-point summary waiting for you in Slack.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Agents don't replace SREs; they replace the boring, repetitive data-gathering tasks that burn SREs out. They do the digging, so humans can do the deciding.&lt;/p&gt;

&lt;p&gt;Cite this research:&lt;br&gt;
If you are building AIOps tools or researching AI in operations, you can cite my recent production benchmarks on how AI agents can autonomously resolve 67% of common incidents:&lt;br&gt;
Madduri, P. (2026). "Agentic SRE Teams: Human-Agent Collaboration - A New Operational Model for Autonomous Incident Response." Power System Protection and Control, 54(1).&lt;br&gt;
[&lt;a href="https://scholar.google.com/citations?view_op=view_citation&amp;amp;hl=en&amp;amp;user=au0O-8oAAAAJ&amp;amp;citation_for_view=au0O-8oAAAAJ:UeHWp8X0CEIC" rel="noopener noreferrer"&gt;Link to Google Scholar&lt;/a&gt;] | [&lt;a href="https://www.researchgate.net/publication/401333715_AGENTIC_SRE_TEAMS_HUMAN-AGENT_COLLABORATION_-A_NEW_OPERATIONAL_MODEL_FOR_AUTONOMOUS_INCIDENT_RESPONSE" rel="noopener noreferrer"&gt;Link to ResearchGate PDF&lt;/a&gt;]&lt;/p&gt;

</description>
      <category>ai</category>
      <category>sre</category>
      <category>devops</category>
      <category>aiops</category>
    </item>
  </channel>
</rss>
