<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: InstaDevOps</title>
    <description>The latest articles on Forem by InstaDevOps (@instadevops).</description>
    <link>https://forem.com/instadevops</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2952358%2F474aa7f4-09cf-409d-891e-cfc4f071d18a.png</url>
      <title>Forem: InstaDevOps</title>
      <link>https://forem.com/instadevops</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/instadevops"/>
    <language>en</language>
    <item>
      <title>Building a DevSecOps Pipeline: Shift Security Left Without Slowing Down</title>
      <dc:creator>InstaDevOps</dc:creator>
      <pubDate>Thu, 16 Apr 2026 13:46:46 +0000</pubDate>
      <link>https://forem.com/instadevops/building-a-devsecops-pipeline-shift-security-left-without-slowing-down-31fl</link>
      <guid>https://forem.com/instadevops/building-a-devsecops-pipeline-shift-security-left-without-slowing-down-31fl</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Every week brings news of another data breach, supply chain attack, or misconfigured cloud resource exposing millions of records. The traditional approach of bolting security on at the end of the development cycle no longer works. By the time a penetration tester finds a vulnerability in staging, the code has been through weeks of development, and fixing it means expensive rework.&lt;/p&gt;

&lt;p&gt;DevSecOps integrates security testing directly into your CI/CD pipeline so vulnerabilities are caught in minutes, not months. The goal is not to slow down development - it is to catch security issues at the same speed you catch bugs: automatically, on every commit.&lt;/p&gt;

&lt;p&gt;This guide walks through building a practical DevSecOps pipeline with real tool configurations, covering static analysis, dependency scanning, container image scanning, infrastructure security, and policy-as-code.&lt;/p&gt;

&lt;h2&gt;
  
  
  The DevSecOps Pipeline Stages
&lt;/h2&gt;

&lt;p&gt;A mature DevSecOps pipeline runs security checks at every stage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Code Commit → SAST → Dependency Scan → Build → Container Scan → IaC Scan → DAST → Deploy → Runtime Protection
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You do not need all of these on day one. Start with the highest-impact, lowest-friction stages and layer on additional checks as your team matures.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stage 1: Static Application Security Testing (SAST)
&lt;/h2&gt;

&lt;p&gt;SAST tools analyze your source code for vulnerabilities without executing it. They catch issues like SQL injection, cross-site scripting, hardcoded secrets, and insecure cryptographic patterns.&lt;/p&gt;

&lt;h3&gt;
  
  
  Semgrep for Custom and Community Rules
&lt;/h3&gt;

&lt;p&gt;Semgrep is fast, supports 30+ languages, and lets you write custom rules in a simple YAML syntax:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/sast.yml&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;SAST Scan&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;semgrep&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;container&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;semgrep/semgrep&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;semgrep scan --config auto --error --sarif --output semgrep.sarif .&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;github/codeql-action/upload-sarif@v3&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;sarif_file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;semgrep.sarif&lt;/span&gt;
        &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--config auto&lt;/code&gt; flag uses Semgrep's curated community ruleset. For stricter scanning, add specific rule packs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;semgrep scan &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--config&lt;/span&gt; p/security-audit &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--config&lt;/span&gt; p/secrets &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--config&lt;/span&gt; p/owasp-top-ten &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--error&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Writing Custom Security Rules
&lt;/h3&gt;

&lt;p&gt;Create organization-specific rules for patterns your team should avoid:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .semgrep/custom-rules.yml&lt;/span&gt;
&lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;no-raw-sql-queries&lt;/span&gt;
    &lt;span class="na"&gt;patterns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;pattern&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;db.query($SQL, ...)&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;pattern-not&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;db.query($SQL, [...])&lt;/span&gt;
    &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Use&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;parameterized&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;queries&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;prevent&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;SQL&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;injection"&lt;/span&gt;
    &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ERROR&lt;/span&gt;
    &lt;span class="na"&gt;languages&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;javascript&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;typescript&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;no-console-log-in-production&lt;/span&gt;
    &lt;span class="na"&gt;pattern&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;console.log(...)&lt;/span&gt;
    &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Remove&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;console.log&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;before&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;merging&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;main"&lt;/span&gt;
    &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;WARNING&lt;/span&gt;
    &lt;span class="na"&gt;languages&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;javascript&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;typescript&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;src/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Stage 2: Dependency and Supply Chain Scanning
&lt;/h2&gt;

&lt;p&gt;Your application's dependencies are a massive attack surface. A single vulnerable transitive dependency can compromise your entire system.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scanning with Trivy
&lt;/h3&gt;

&lt;p&gt;Trivy scans not just containers but also filesystems for vulnerable dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/dependency-scan.yml&lt;/span&gt;
&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;scan-dependencies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run Trivy filesystem scan&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aquasecurity/trivy-action@master&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;scan-type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;fs&lt;/span&gt;
          &lt;span class="na"&gt;scan-ref&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.&lt;/span&gt;
          &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HIGH,CRITICAL&lt;/span&gt;
          &lt;span class="na"&gt;exit-code&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
          &lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sarif&lt;/span&gt;
          &lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;trivy-fs.sarif&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;github/codeql-action/upload-sarif@v3&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;sarif_file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;trivy-fs.sarif&lt;/span&gt;
        &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Software Bill of Materials (SBOM)
&lt;/h3&gt;

&lt;p&gt;Generate an SBOM for compliance and vulnerability tracking:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Generate SBOM with Syft&lt;/span&gt;
syft packages &lt;span class="nb"&gt;dir&lt;/span&gt;:. &lt;span class="nt"&gt;-o&lt;/span&gt; spdx-json &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; sbom.spdx.json

&lt;span class="c"&gt;# Scan the SBOM for vulnerabilities with Grype&lt;/span&gt;
grype sbom:sbom.spdx.json &lt;span class="nt"&gt;--fail-on&lt;/span&gt; high
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Automated Dependency Updates
&lt;/h3&gt;

&lt;p&gt;Use Renovate or Dependabot to keep dependencies current:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;renovate.json&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"$schema"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://docs.renovatebot.com/renovate-schema.json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"extends"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"config:recommended"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"vulnerabilityAlerts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"labels"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"security"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"packageRules"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"matchUpdateTypes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"patch"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"automerge"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"automergeType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"branch"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"matchUpdateTypes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"major"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"labels"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"breaking-change"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"assignees"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"team-lead"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Stage 3: Container Image Scanning
&lt;/h2&gt;

&lt;p&gt;Container images often contain vulnerable OS packages, misconfigured permissions, and unnecessary tools that increase the attack surface.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scanning During Build
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;build-and-scan&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build image&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker build -t myapp:${{ github.sha }} .&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Scan image with Trivy&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aquasecurity/trivy-action@master&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;image-ref&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;myapp:${{ github.sha }}&lt;/span&gt;
          &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HIGH,CRITICAL&lt;/span&gt;
          &lt;span class="na"&gt;exit-code&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Check for root user&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;USER=$(docker inspect --format='{{.Config.User}}' myapp:${{ github.sha }})&lt;/span&gt;
          &lt;span class="s"&gt;if [ -z "$USER" ] || [ "$USER" = "root" ]; then&lt;/span&gt;
            &lt;span class="s"&gt;echo "ERROR: Container runs as root. Add a USER directive to your Dockerfile."&lt;/span&gt;
            &lt;span class="s"&gt;exit 1&lt;/span&gt;
          &lt;span class="s"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Hardening Dockerfiles
&lt;/h3&gt;

&lt;p&gt;A secure Dockerfile follows these patterns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# Use specific version, not :latest&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;node:20.11-alpine&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;builder&lt;/span&gt;

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; package*.json ./&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;npm ci &lt;span class="nt"&gt;--only&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;production

&lt;span class="c"&gt;# Multi-stage build - no build tools in final image&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; node:20.11-alpine&lt;/span&gt;

&lt;span class="c"&gt;# Create non-root user&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;addgroup &lt;span class="nt"&gt;-g&lt;/span&gt; 1001 &lt;span class="nt"&gt;-S&lt;/span&gt; appuser &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    adduser &lt;span class="nt"&gt;-S&lt;/span&gt; appuser &lt;span class="nt"&gt;-u&lt;/span&gt; 1001 &lt;span class="nt"&gt;-G&lt;/span&gt; appuser

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder /app/node_modules ./node_modules&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --chown=appuser:appuser . .&lt;/span&gt;

&lt;span class="c"&gt;# Drop all capabilities&lt;/span&gt;
&lt;span class="k"&gt;USER&lt;/span&gt;&lt;span class="s"&gt; appuser&lt;/span&gt;

&lt;span class="c"&gt;# Use dumb-init to handle PID 1 properly&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;apk add &lt;span class="nt"&gt;--no-cache&lt;/span&gt; dumb-init
&lt;span class="k"&gt;ENTRYPOINT&lt;/span&gt;&lt;span class="s"&gt; ["dumb-init", "--"]&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["node", "server.js"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Stage 4: Infrastructure as Code Security
&lt;/h2&gt;

&lt;p&gt;Misconfigured infrastructure is the leading cause of cloud breaches. Scan your Terraform, CloudFormation, and Kubernetes manifests before they reach production.&lt;/p&gt;

&lt;h3&gt;
  
  
  Checkov for IaC Scanning
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;iac-scan&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run Checkov&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bridgecrewio/checkov-action@v12&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;directory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform/&lt;/span&gt;
          &lt;span class="na"&gt;framework&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform&lt;/span&gt;
          &lt;span class="na"&gt;soft_fail&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
          &lt;span class="na"&gt;output_format&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sarif&lt;/span&gt;
          &lt;span class="na"&gt;output_file_path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;checkov.sarif&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;github/codeql-action/upload-sarif@v3&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;sarif_file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;checkov.sarif&lt;/span&gt;
        &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Checkov catches misconfigurations like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;S3 buckets without encryption&lt;/li&gt;
&lt;li&gt;Security groups open to 0.0.0.0/0&lt;/li&gt;
&lt;li&gt;RDS instances without encryption at rest&lt;/li&gt;
&lt;li&gt;IAM policies with wildcard permissions&lt;/li&gt;
&lt;li&gt;EBS volumes without encryption&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  tfsec for Terraform-Specific Scanning
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Run tfsec on your Terraform directory&lt;/span&gt;
tfsec terraform/ &lt;span class="nt"&gt;--format&lt;/span&gt; sarif &lt;span class="nt"&gt;--out&lt;/span&gt; tfsec.sarif

&lt;span class="c"&gt;# Common findings:&lt;/span&gt;
&lt;span class="c"&gt;# - aws-ec2-no-public-ingress-sgr&lt;/span&gt;
&lt;span class="c"&gt;# - aws-s3-enable-bucket-encryption&lt;/span&gt;
&lt;span class="c"&gt;# - aws-iam-no-policy-wildcards&lt;/span&gt;
&lt;span class="c"&gt;# - aws-rds-encrypt-instance-storage-data&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Stage 5: Dynamic Application Security Testing (DAST)
&lt;/h2&gt;

&lt;p&gt;DAST tools test your running application by sending requests and analyzing responses for vulnerabilities. Run these against a staging environment after deployment.&lt;/p&gt;

&lt;h3&gt;
  
  
  OWASP ZAP Baseline Scan
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;dast&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deploy-staging&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;OWASP ZAP Baseline Scan&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;zaproxy/action-baseline@v0.12.0&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://staging.example.com&lt;/span&gt;
          &lt;span class="na"&gt;rules_file_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;zap-rules.tsv&lt;/span&gt;
          &lt;span class="na"&gt;fail_action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
          &lt;span class="na"&gt;cmd_options&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;-a&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-j'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;DAST scans are slower (minutes to hours) and noisier than SAST, so run them post-deployment rather than on every commit.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stage 6: Policy-as-Code with Open Policy Agent
&lt;/h2&gt;

&lt;p&gt;Policy-as-Code enforces organizational rules programmatically. Open Policy Agent (OPA) lets you write policies in Rego that gate deployments:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rego"&gt;&lt;code&gt;&lt;span class="c1"&gt;# policy/kubernetes.rego&lt;/span&gt;
&lt;span class="ow"&gt;package&lt;/span&gt; &lt;span class="n"&gt;kubernetes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;admission&lt;/span&gt;

&lt;span class="n"&gt;deny&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kind&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"Deployment"&lt;/span&gt;
  &lt;span class="n"&gt;container&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;containers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;limits&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt;
  &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Container '%v' must have memory limits set"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;deny&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kind&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"Deployment"&lt;/span&gt;
  &lt;span class="n"&gt;container&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;containers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;
  &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"@sha256:"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;regex&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"^.*:[0-9]+\\.[0-9]+\\.[0-9]+$"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Container '%v' must use a specific version tag or digest, not :latest"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;deny&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kind&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"Deployment"&lt;/span&gt;
  &lt;span class="n"&gt;container&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;containers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;securityContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;privileged&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Container '%v' must not run in privileged mode"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;container&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Conftest for CI Integration
&lt;/h3&gt;

&lt;p&gt;Use Conftest to run OPA policies against Kubernetes manifests, Terraform plans, and Dockerfiles in CI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Test Kubernetes manifests&lt;/span&gt;
conftest &lt;span class="nb"&gt;test &lt;/span&gt;k8s/deployment.yaml &lt;span class="nt"&gt;--policy&lt;/span&gt; policy/

&lt;span class="c"&gt;# Test Terraform plans&lt;/span&gt;
terraform plan &lt;span class="nt"&gt;-out&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;tfplan
terraform show &lt;span class="nt"&gt;-json&lt;/span&gt; tfplan &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; tfplan.json
conftest &lt;span class="nb"&gt;test &lt;/span&gt;tfplan.json &lt;span class="nt"&gt;--policy&lt;/span&gt; policy/terraform/

&lt;span class="c"&gt;# Test Dockerfiles&lt;/span&gt;
conftest &lt;span class="nb"&gt;test &lt;/span&gt;Dockerfile &lt;span class="nt"&gt;--policy&lt;/span&gt; policy/docker/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Putting It All Together
&lt;/h2&gt;

&lt;p&gt;Here is a complete pipeline that ties all stages together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;DevSecOps Pipeline&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;main&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;main&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;# Fast checks first (&amp;lt; 2 minutes)&lt;/span&gt;
  &lt;span class="na"&gt;secrets-scan&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;trufflesecurity/trufflehog@main&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;extra_args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;--only-verified&lt;/span&gt;

  &lt;span class="na"&gt;sast&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;docker run --rm -v "${PWD}:/src" semgrep/semgrep \&lt;/span&gt;
            &lt;span class="s"&gt;semgrep scan --config auto --error /src&lt;/span&gt;

  &lt;span class="c1"&gt;# Medium checks (2-5 minutes)&lt;/span&gt;
  &lt;span class="na"&gt;dependency-scan&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aquasecurity/trivy-action@master&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;scan-type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;fs&lt;/span&gt;
          &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HIGH,CRITICAL&lt;/span&gt;
          &lt;span class="na"&gt;exit-code&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;

  &lt;span class="na"&gt;iac-scan&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bridgecrewio/checkov-action@v12&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;directory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform/&lt;/span&gt;

  &lt;span class="c1"&gt;# Build and container scan&lt;/span&gt;
  &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;secrets-scan&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;sast&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;dependency-scan&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;iac-scan&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker build -t app:${{ github.sha }} .&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aquasecurity/trivy-action@master&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;image-ref&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;app:${{ github.sha }}&lt;/span&gt;
          &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HIGH,CRITICAL&lt;/span&gt;
          &lt;span class="na"&gt;exit-code&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;

  &lt;span class="c1"&gt;# Deploy to staging, then DAST&lt;/span&gt;
  &lt;span class="na"&gt;deploy-staging&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
    &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;github.ref == 'refs/heads/main'&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;staging&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;echo "Deploy to staging"&lt;/span&gt;

  &lt;span class="na"&gt;dast&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deploy-staging&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;zaproxy/action-baseline@v0.12.0&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://staging.example.com&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Managing False Positives
&lt;/h2&gt;

&lt;p&gt;Every security scanning tool produces false positives. If you do not manage them, your team will start ignoring security alerts entirely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Create a baseline.&lt;/strong&gt; On initial integration, capture existing findings as a baseline and only alert on new issues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use inline suppressions with justification.&lt;/strong&gt; Every suppressed finding should include a reason:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# nosemgrep: python.lang.security.audit.insecure-hash.insecure-hash
# Justification: MD5 used for non-security cache key, not for cryptographic purposes
&lt;/span&gt;&lt;span class="n"&gt;cache_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;md5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Triage weekly.&lt;/strong&gt; Assign someone to review and triage findings weekly. Close false positives, create tickets for real issues, and tune rules that produce too much noise.&lt;/p&gt;

&lt;h2&gt;
  
  
  Security Metrics That Matter
&lt;/h2&gt;

&lt;p&gt;Tracking the right metrics tells you whether your DevSecOps pipeline is actually improving security or just generating noise.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mean Time to Remediate (MTTR)
&lt;/h3&gt;

&lt;p&gt;How long does it take from when a vulnerability is detected to when it is fixed in production? Track this by severity:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Critical:&lt;/strong&gt; Target under 24 hours&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High:&lt;/strong&gt; Target under 7 days&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Medium:&lt;/strong&gt; Target under 30 days&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low:&lt;/strong&gt; Target under 90 days&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Vulnerability Escape Rate
&lt;/h3&gt;

&lt;p&gt;What percentage of vulnerabilities make it past your CI/CD pipeline into production? This is your pipeline's effectiveness metric. Track it by comparing CI findings against production security scans and penetration test results.&lt;/p&gt;

&lt;h3&gt;
  
  
  Coverage Percentage
&lt;/h3&gt;

&lt;p&gt;What percentage of your repositories have security scanning enabled? In most organizations, the answer is disturbingly low. Aim for 100% of production services covered by at minimum dependency scanning and SAST.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fix Rate vs Find Rate
&lt;/h3&gt;

&lt;p&gt;If you are finding vulnerabilities faster than you are fixing them, your backlog will grow indefinitely. This signals that your pipeline is too noisy (tune it) or your team needs more security training.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started: A Phased Approach
&lt;/h2&gt;

&lt;p&gt;Do not try to implement everything at once. Here is a realistic rollout plan:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Week 1-2: Secrets scanning and dependency scanning.&lt;/strong&gt; These have the highest signal-to-noise ratio and catch the most impactful issues. Enable Dependabot or Renovate for automated updates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Week 3-4: SAST with Semgrep.&lt;/strong&gt; Start with the default auto config. Do not write custom rules yet. Let the team get comfortable with the findings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Month 2: Container image scanning.&lt;/strong&gt; Add Trivy to your Docker build pipeline. Fix the critical and high findings, suppress the false positives with documentation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Month 3: IaC scanning with Checkov.&lt;/strong&gt; Scan your Terraform and Kubernetes manifests. This catches misconfigurations before they reach production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Month 4+: DAST and policy-as-code.&lt;/strong&gt; These require more setup and tuning but provide the deepest security coverage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Need Help with Your DevOps?
&lt;/h2&gt;

&lt;p&gt;Building a DevSecOps pipeline that catches real vulnerabilities without drowning your team in false positives takes careful tuning. At &lt;strong&gt;&lt;a href="https://instadevops.com" rel="noopener noreferrer"&gt;InstaDevOps&lt;/a&gt;&lt;/strong&gt;, we implement security-hardened CI/CD pipelines for startups and growing teams - baking security in from day one.&lt;/p&gt;

&lt;p&gt;Plans start at &lt;strong&gt;$2,999/mo&lt;/strong&gt; for a dedicated fractional DevOps engineer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://calendly.com/instadevops/15min" rel="noopener noreferrer"&gt;Book a free 15-minute consultation&lt;/a&gt;&lt;/strong&gt; to discuss your security posture.&lt;/p&gt;

</description>
      <category>devsecops</category>
      <category>security</category>
      <category>cicd</category>
      <category>devops</category>
    </item>
    <item>
      <title>AWS ECS vs EKS: A Practical Decision Framework for Startups</title>
      <dc:creator>InstaDevOps</dc:creator>
      <pubDate>Wed, 15 Apr 2026 13:46:43 +0000</pubDate>
      <link>https://forem.com/instadevops/aws-ecs-vs-eks-a-practical-decision-framework-for-startups-3dgp</link>
      <guid>https://forem.com/instadevops/aws-ecs-vs-eks-a-practical-decision-framework-for-startups-3dgp</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;You have containerized your application and now you need to run it in production on AWS. The two main options stare back at you from the AWS console: Elastic Container Service (ECS) and Elastic Kubernetes Service (EKS). Both are fully managed, both run containers, and both integrate deeply with the AWS ecosystem.&lt;/p&gt;

&lt;p&gt;So which one should you pick?&lt;/p&gt;

&lt;p&gt;The answer is not "EKS because Kubernetes is the industry standard" and it is not "ECS because it is simpler." The right choice depends on your team size, operational maturity, workload characteristics, and growth trajectory. This article gives you a practical framework for making this decision, with real cost numbers and migration considerations.&lt;/p&gt;

&lt;h2&gt;
  
  
  ECS: The AWS-Native Path
&lt;/h2&gt;

&lt;p&gt;ECS is AWS's proprietary container orchestration service. It is deeply integrated with the AWS ecosystem and was purpose-built to make running containers on AWS as straightforward as possible.&lt;/p&gt;

&lt;h3&gt;
  
  
  ECS Architecture
&lt;/h3&gt;

&lt;p&gt;ECS has two launch types:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;EC2 Launch Type:&lt;/strong&gt; You manage the underlying EC2 instances. ECS handles container placement and scheduling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fargate Launch Type:&lt;/strong&gt; Serverless containers. AWS manages the infrastructure entirely. You just define CPU, memory, and your container image.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"family"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"my-api"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"networkMode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"awsvpc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"requiresCompatibilities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"FARGATE"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cpu"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"512"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"memory"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1024"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"containerDefinitions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"api"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"image"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"123456789012.dkr.ecr.eu-west-1.amazonaws.com/my-api:latest"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"portMappings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"containerPort"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"protocol"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tcp"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"logConfiguration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"logDriver"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"awslogs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"options"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"awslogs-group"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/ecs/my-api"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"awslogs-region"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"eu-west-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"awslogs-stream-prefix"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ecs"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ECS Strengths
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Zero Kubernetes knowledge required.&lt;/strong&gt; Your team interacts with familiar AWS concepts: task definitions, services, target groups, IAM roles. There is no &lt;code&gt;kubectl&lt;/code&gt;, no YAML manifests with 15 indentation levels, no CRDs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deep AWS integration.&lt;/strong&gt; ECS natively integrates with ALB/NLB, CloudWatch, IAM task roles, Secrets Manager, Parameter Store, App Mesh, and CloudMap for service discovery. These integrations work out of the box with minimal configuration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lower operational overhead.&lt;/strong&gt; Especially with Fargate, there are no nodes to patch, no cluster upgrades to manage, no etcd to worry about. AWS handles all of it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Simpler networking.&lt;/strong&gt; The &lt;code&gt;awsvpc&lt;/code&gt; network mode gives each task its own ENI and security group. No overlay networks, no kube-proxy, no CNI plugin decisions.&lt;/p&gt;

&lt;h3&gt;
  
  
  ECS Limitations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Vendor lock-in to AWS&lt;/li&gt;
&lt;li&gt;Smaller community and ecosystem compared to Kubernetes&lt;/li&gt;
&lt;li&gt;Limited scheduling customization&lt;/li&gt;
&lt;li&gt;No equivalent to Kubernetes operators or CRDs for extending the platform&lt;/li&gt;
&lt;li&gt;Fewer third-party tools (no Helm, no ArgoCD, no Istio)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  EKS: The Kubernetes Path
&lt;/h2&gt;

&lt;p&gt;EKS is AWS's managed Kubernetes service. AWS manages the control plane (API server, etcd, scheduler), and you manage the worker nodes or use Fargate for serverless pods.&lt;/p&gt;

&lt;h3&gt;
  
  
  EKS Architecture
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Sample Kubernetes deployment&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-api&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;production&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-api&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-api&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;serviceAccountName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-api&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api&lt;/span&gt;
          &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;123456789012.dkr.ecr.eu-west-1.amazonaws.com/my-api:latest&lt;/span&gt;
          &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3000&lt;/span&gt;
          &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;250m&lt;/span&gt;
              &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;512Mi&lt;/span&gt;
            &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;500m&lt;/span&gt;
              &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;1Gi&lt;/span&gt;
          &lt;span class="na"&gt;livenessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/health&lt;/span&gt;
              &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3000&lt;/span&gt;
            &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
          &lt;span class="na"&gt;readinessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/ready&lt;/span&gt;
              &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3000&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Service&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-api&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-api&lt;/span&gt;
  &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
      &lt;span class="na"&gt;targetPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3000&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  EKS Strengths
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Portability.&lt;/strong&gt; Kubernetes manifests work on any cloud provider or on-premises cluster. If multi-cloud or hybrid-cloud is in your future, EKS keeps your options open.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rich ecosystem.&lt;/strong&gt; Helm charts, ArgoCD for GitOps, Istio or Linkerd for service mesh, Prometheus and Grafana for monitoring, cert-manager for TLS, external-dns for DNS automation. The Kubernetes ecosystem is enormous.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Advanced scheduling.&lt;/strong&gt; Node affinities, taints and tolerations, pod topology spread constraints, priority classes, and preemption give you fine-grained control over workload placement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Extensibility.&lt;/strong&gt; Custom Resource Definitions and operators let you extend Kubernetes to manage databases, message queues, ML pipelines, and virtually anything else as native Kubernetes resources.&lt;/p&gt;

&lt;h3&gt;
  
  
  EKS Limitations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;$0.10/hour ($73/month) control plane cost before any compute&lt;/li&gt;
&lt;li&gt;Steep learning curve (expect 3-6 months for a team new to Kubernetes)&lt;/li&gt;
&lt;li&gt;Cluster upgrades require planning and testing (new version every 4 months)&lt;/li&gt;
&lt;li&gt;More moving parts: CNI plugins, ingress controllers, cluster autoscaler, DNS&lt;/li&gt;
&lt;li&gt;Significantly higher operational burden&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Cost Comparison
&lt;/h2&gt;

&lt;p&gt;Let us compare the cost of running a typical startup workload: 3 services, each running 2 replicas with 0.5 vCPU and 1 GB RAM.&lt;/p&gt;

&lt;h3&gt;
  
  
  ECS Fargate
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Per task: 0.5 vCPU + 1 GB RAM
Fargate pricing (eu-west-1):
  vCPU: $0.04048/hour
  Memory: $0.004445/GB/hour

Per task/hour: (0.5 * $0.04048) + (1 * $0.004445) = $0.02469
6 tasks total: $0.02469 * 6 = $0.14814/hour

Monthly: $0.14814 * 730 = $108.14/month
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ECS EC2 (with Reserved Instances)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2x t3.medium (2 vCPU, 4 GB) with 1-year RI:
  On-demand: $0.0416/hour each
  1-year RI (no upfront): $0.027/hour each

Monthly: 2 * $0.027 * 730 = $39.42/month
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  EKS with Managed Node Group
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EKS control plane: $0.10/hour = $73/month
2x t3.medium nodes (same as ECS EC2):
  1-year RI: 2 * $0.027 * 730 = $39.42/month

Monthly total: $73 + $39.42 = $112.42/month
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  EKS Fargate
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EKS control plane: $73/month
Fargate pods (same as ECS Fargate): $108.14/month
Plus CoreDNS pods on Fargate: ~$15/month

Monthly total: $73 + $108.14 + $15 = $196.14/month
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At small scale, ECS on EC2 is the cheapest option by far. EKS adds a fixed $73/month control plane cost that only becomes negligible at larger scales.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Decision Framework
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Choose ECS When
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Your team is small (under 10 engineers).&lt;/strong&gt; The operational overhead of Kubernetes is not justified when a few people can manage ECS with part-time attention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You are all-in on AWS.&lt;/strong&gt; If you have no plans for multi-cloud and are already using ALB, CloudWatch, and IAM extensively, ECS slots in naturally.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You want Fargate's simplicity.&lt;/strong&gt; ECS Fargate is the easiest path from container image to running workload. No nodes, no cluster management, no upgrades.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your workloads are straightforward.&lt;/strong&gt; Web APIs, background workers, and scheduled tasks that do not need advanced scheduling or custom operators.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You need to ship fast.&lt;/strong&gt; A team new to containers can have ECS running in production within a week. Kubernetes takes months to operate confidently.&lt;/p&gt;

&lt;h3&gt;
  
  
  Choose EKS When
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Multi-cloud or hybrid is a real requirement.&lt;/strong&gt; Not a hypothetical future possibility, but an actual business constraint.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You have Kubernetes experience on the team.&lt;/strong&gt; If your engineers already know Kubernetes, EKS employs that knowledge. Forcing Kubernetes-experienced engineers to learn ECS is a lateral move with no payoff.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You need the ecosystem.&lt;/strong&gt; If your architecture requires service mesh, GitOps (ArgoCD), advanced traffic management, or custom operators, the Kubernetes ecosystem is unmatched.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You are running 20+ services.&lt;/strong&gt; At scale, Kubernetes' sophisticated scheduling, resource management, and namespace isolation become genuine advantages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You are building a platform team.&lt;/strong&gt; If part of your strategy is building an internal developer platform, Kubernetes provides the extensibility to build abstractions on top.&lt;/p&gt;

&lt;h2&gt;
  
  
  Migration Paths
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ECS to EKS
&lt;/h3&gt;

&lt;p&gt;If you outgrow ECS, migration is manageable:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Set up EKS cluster alongside existing ECS&lt;/li&gt;
&lt;li&gt;Convert task definitions to Kubernetes deployments (the container config is similar)&lt;/li&gt;
&lt;li&gt;Use AWS ALB Ingress Controller to maintain the same load balancer&lt;/li&gt;
&lt;li&gt;Migrate services one at a time, shifting traffic gradually&lt;/li&gt;
&lt;li&gt;Decommission ECS services after verification&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  EKS to ECS
&lt;/h3&gt;

&lt;p&gt;Rarer, but it happens when teams realize they over-invested in Kubernetes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Convert Kubernetes deployments to ECS task definitions&lt;/li&gt;
&lt;li&gt;Replace Kubernetes services with ECS services + ALB target groups&lt;/li&gt;
&lt;li&gt;Replace Helm/ArgoCD with AWS CodePipeline or similar&lt;/li&gt;
&lt;li&gt;Migrate ConfigMaps/Secrets to AWS Parameter Store/Secrets Manager&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  A Pragmatic Recommendation
&lt;/h2&gt;

&lt;p&gt;For most startups and SMBs running fewer than 15 services on AWS, &lt;strong&gt;start with ECS&lt;/strong&gt;. Here is why:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You avoid 3-6 months of Kubernetes learning curve&lt;/li&gt;
&lt;li&gt;You save $73/month in control plane costs (trivial individually, but symbolic of broader overhead)&lt;/li&gt;
&lt;li&gt;You eliminate cluster upgrade maintenance windows&lt;/li&gt;
&lt;li&gt;Your engineers focus on product features instead of infrastructure plumbing&lt;/li&gt;
&lt;li&gt;If you outgrow ECS, the migration to EKS is well-understood&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The exception: if you already have a team member who is deeply experienced with Kubernetes and can own the cluster operations, EKS is fine from day one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Operational Overhead: What Nobody Tells You
&lt;/h2&gt;

&lt;p&gt;The cost comparison above only covers compute. The real difference between ECS and EKS is operational overhead - the time your engineers spend managing the orchestration platform itself rather than building product features.&lt;/p&gt;

&lt;h3&gt;
  
  
  ECS Day-2 Operations
&lt;/h3&gt;

&lt;p&gt;With ECS (especially Fargate), your ongoing operational tasks are minimal:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Task definition updates&lt;/strong&gt; when you change container config (straightforward JSON/Terraform changes)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Service scaling policies&lt;/strong&gt; adjustments based on traffic patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ALB health check tuning&lt;/strong&gt; to match your application's startup time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security group and IAM role updates&lt;/strong&gt; as your network requirements change&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most of this is standard AWS work that any cloud engineer can handle.&lt;/p&gt;

&lt;h3&gt;
  
  
  EKS Day-2 Operations
&lt;/h3&gt;

&lt;p&gt;EKS adds a significant layer of Kubernetes-specific operational work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cluster version upgrades&lt;/strong&gt; every 4 months (Kubernetes drops support for old versions aggressively). Each upgrade requires testing all workloads, updating add-ons, and often updating manifests for deprecated APIs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add-on management&lt;/strong&gt;: CoreDNS, kube-proxy, VPC CNI, cluster autoscaler, ingress controller, cert-manager, external-dns - each has its own version matrix and upgrade path.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Node group rotation&lt;/strong&gt; when you change AMIs or instance types. Cordon, drain, replace.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RBAC policy management&lt;/strong&gt; as teams and services grow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;etcd monitoring&lt;/strong&gt; for the control plane (managed by AWS, but you still need to understand its implications for API server performance).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debugging Kubernetes networking&lt;/strong&gt; when services cannot reach each other (is it a NetworkPolicy? A CNI issue? A security group? A missing service account?).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A reasonable estimate: EKS adds 10-20 hours per month of platform engineering work compared to ECS. At a senior engineer's loaded cost, that is $3,000-6,000/month in operational overhead that does not show up on your AWS bill.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Scenarios
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Scenario 1: 4-Person Startup, 2 Services
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Choose ECS Fargate.&lt;/strong&gt; You do not have the engineering bandwidth to learn and operate Kubernetes. ECS Fargate lets you deploy containers in an afternoon and forget about infrastructure management for the next 12 months.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scenario 2: 25-Person Scale-Up, 12 Services, AWS Only
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Choose ECS EC2.&lt;/strong&gt; You are big enough that Fargate costs add up, but your services are straightforward enough that Kubernetes' advanced features are not needed. ECS on EC2 with Reserved Instances gives you the best cost efficiency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scenario 3: 50-Person Company, 30 Services, Multi-Cloud Mandate
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Choose EKS.&lt;/strong&gt; At this scale, you likely have or can hire a dedicated platform engineer. Kubernetes' portability and ecosystem justify the operational investment, and you will benefit from tools like ArgoCD, Istio, and custom operators.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scenario 4: Regulated Industry, Compliance Requirements
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;EKS edges ahead.&lt;/strong&gt; Kubernetes' namespace isolation, network policies, pod security standards, and OPA Gatekeeper provide a richer set of compliance controls. Combined with tools like Falco for runtime security, you can build a compliance story that auditors understand and respect.&lt;/p&gt;

&lt;h2&gt;
  
  
  Need Help with Your DevOps?
&lt;/h2&gt;

&lt;p&gt;Choosing the right container orchestration platform is just the first decision. At &lt;strong&gt;&lt;a href="https://instadevops.com" rel="noopener noreferrer"&gt;InstaDevOps&lt;/a&gt;&lt;/strong&gt;, we help startups set up production-grade container infrastructure on AWS - whether that is ECS, EKS, or a migration between the two.&lt;/p&gt;

&lt;p&gt;Plans start at &lt;strong&gt;$2,999/mo&lt;/strong&gt; for a dedicated fractional DevOps engineer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://calendly.com/instadevops/15min" rel="noopener noreferrer"&gt;Book a free 15-minute consultation&lt;/a&gt;&lt;/strong&gt; to discuss your container strategy.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>ecs</category>
      <category>eks</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Advanced GitHub Actions: Matrix Builds, Reusable Workflows, and Self-Hosted Runners</title>
      <dc:creator>InstaDevOps</dc:creator>
      <pubDate>Tue, 14 Apr 2026 13:46:40 +0000</pubDate>
      <link>https://forem.com/instadevops/advanced-github-actions-matrix-builds-reusable-workflows-and-self-hosted-runners-h07</link>
      <guid>https://forem.com/instadevops/advanced-github-actions-matrix-builds-reusable-workflows-and-self-hosted-runners-h07</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;GitHub Actions has matured from a simple CI tool into a full-featured automation platform. Most teams start with a basic build-and-test workflow, but GitHub Actions offers far more powerful patterns that can dramatically reduce pipeline duplication, speed up builds, and cut CI costs.&lt;/p&gt;

&lt;p&gt;This article covers advanced patterns that separate hobby-level GitHub Actions usage from production-grade CI/CD: matrix builds for testing across multiple environments, reusable workflows for eliminating duplication, self-hosted runners for cost and performance, and secrets management strategies that keep your pipelines secure.&lt;/p&gt;

&lt;p&gt;If you are already comfortable writing basic GitHub Actions workflows, this guide will take you to the next level.&lt;/p&gt;

&lt;h2&gt;
  
  
  Matrix Builds: Test Everything in Parallel
&lt;/h2&gt;

&lt;p&gt;Matrix builds let you run the same job across multiple combinations of variables - OS versions, language versions, dependency versions - without duplicating workflow code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Basic Matrix Strategy
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;matrix&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;18&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;20&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;22&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
        &lt;span class="na"&gt;os&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;ubuntu-latest&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;macos-latest&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;fail-fast&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-node@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ matrix.node-version }}&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm ci&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm test&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Setting &lt;code&gt;fail-fast: false&lt;/code&gt; ensures all matrix combinations run to completion even if one fails. This is critical for understanding the full scope of compatibility issues.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dynamic Matrix Generation
&lt;/h3&gt;

&lt;p&gt;For more complex scenarios, generate the matrix dynamically from a previous job:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;detect-services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;outputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ steps.find.outputs.services }}&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;find&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;# Find all directories with a Dockerfile&lt;/span&gt;
          &lt;span class="s"&gt;SERVICES=$(find services/ -name Dockerfile -exec dirname {} \; | jq -R -s -c 'split("\n")[:-1]')&lt;/span&gt;
          &lt;span class="s"&gt;echo "services=$SERVICES" &amp;gt;&amp;gt; $GITHUB_OUTPUT&lt;/span&gt;

  &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;detect-services&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;matrix&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ fromJson(needs.detect-services.outputs.services) }}&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker build -t ${{ matrix.service }}:latest ${{ matrix.service }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This pattern is invaluable for monorepos: only build and test services that have changed, and automatically pick up new services without modifying the workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Matrix Include and Exclude
&lt;/h3&gt;

&lt;p&gt;Fine-tune your matrix with include and exclude rules:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;matrix&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;os&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;ubuntu-latest&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;windows-latest&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;node&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;18&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;20&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="c1"&gt;# Add an experimental Node 22 build on Ubuntu only&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;os&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
        &lt;span class="na"&gt;node&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;22&lt;/span&gt;
        &lt;span class="na"&gt;experimental&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;exclude&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="c1"&gt;# Skip Node 18 on Windows (known incompatibility)&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;os&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;windows-latest&lt;/span&gt;
        &lt;span class="na"&gt;node&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;18&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Reusable Workflows: Eliminate Duplication
&lt;/h2&gt;

&lt;p&gt;If you have 10 microservices with nearly identical CI workflows, reusable workflows let you define the pipeline once and call it from each repository.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating a Reusable Workflow
&lt;/h3&gt;

&lt;p&gt;Define a workflow with &lt;code&gt;workflow_call&lt;/code&gt; trigger in a shared repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/build-and-deploy.yml in org/shared-workflows repo&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build and Deploy Service&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;workflow_call&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;inputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;service-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
      &lt;span class="na"&gt;dockerfile-path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
        &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;./Dockerfile'&lt;/span&gt;
      &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;string&lt;/span&gt;
    &lt;span class="na"&gt;secrets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;ECR_REGISTRY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Configure AWS credentials&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/configure-aws-credentials@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;aws-access-key-id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_ACCESS_KEY_ID }}&lt;/span&gt;
          &lt;span class="na"&gt;aws-secret-access-key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_SECRET_ACCESS_KEY }}&lt;/span&gt;
          &lt;span class="na"&gt;aws-region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eu-west-1&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Login to ECR&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/amazon-ecr-login@v2&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build and push image&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;IMAGE="${{ secrets.ECR_REGISTRY }}/${{ inputs.service-name }}:${{ github.sha }}"&lt;/span&gt;
          &lt;span class="s"&gt;docker build -f ${{ inputs.dockerfile-path }} -t $IMAGE .&lt;/span&gt;
          &lt;span class="s"&gt;docker push $IMAGE&lt;/span&gt;

  &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ inputs.environment }}&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy to ECS&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;aws ecs update-service \&lt;/span&gt;
            &lt;span class="s"&gt;--cluster ${{ inputs.environment }}-cluster \&lt;/span&gt;
            &lt;span class="s"&gt;--service ${{ inputs.service-name }} \&lt;/span&gt;
            &lt;span class="s"&gt;--force-new-deployment&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Calling a Reusable Workflow
&lt;/h3&gt;

&lt;p&gt;Each service repository calls the shared workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/ci.yml in service repo&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CI/CD&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;main&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;deploy-staging&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;org/shared-workflows/.github/workflows/build-and-deploy.yml@v2.1.0&lt;/span&gt;
    &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;service-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;payment-api&lt;/span&gt;
      &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;staging&lt;/span&gt;
    &lt;span class="na"&gt;secrets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_ACCESS_KEY_ID }}&lt;/span&gt;
      &lt;span class="na"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_SECRET_ACCESS_KEY }}&lt;/span&gt;
      &lt;span class="na"&gt;ECR_REGISTRY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.ECR_REGISTRY }}&lt;/span&gt;

  &lt;span class="na"&gt;deploy-production&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deploy-staging&lt;/span&gt;
    &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;org/shared-workflows/.github/workflows/build-and-deploy.yml@v2.1.0&lt;/span&gt;
    &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;service-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;payment-api&lt;/span&gt;
      &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;production&lt;/span&gt;
    &lt;span class="na"&gt;secrets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;AWS_ACCESS_KEY_ID&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_ACCESS_KEY_ID }}&lt;/span&gt;
      &lt;span class="na"&gt;AWS_SECRET_ACCESS_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_SECRET_ACCESS_KEY }}&lt;/span&gt;
      &lt;span class="na"&gt;ECR_REGISTRY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.ECR_REGISTRY }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pin the reusable workflow to a specific tag (&lt;code&gt;@v2.1.0&lt;/code&gt;) so that upstream changes do not break your pipelines unexpectedly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Self-Hosted Runners: Performance and Cost Control
&lt;/h2&gt;

&lt;p&gt;GitHub-hosted runners are convenient but come with limitations: fixed hardware specs, cold start times, no persistent caches, and costs that scale linearly with usage. Self-hosted runners solve these problems.&lt;/p&gt;

&lt;h3&gt;
  
  
  When Self-Hosted Runners Make Sense
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Large Docker builds&lt;/strong&gt; that benefit from persistent layer caches&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GPU workloads&lt;/strong&gt; for ML model training or testing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network-restricted environments&lt;/strong&gt; where jobs need access to internal resources&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High CI volume&lt;/strong&gt; where GitHub-hosted runner costs exceed $1,000/month&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Specialized hardware&lt;/strong&gt; requirements&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Setting Up Ephemeral Self-Hosted Runners on AWS
&lt;/h3&gt;

&lt;p&gt;Use the &lt;code&gt;actions-runner-controller&lt;/code&gt; (ARC) to autoscale runners on Kubernetes, or use EC2 with a simpler setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Terraform for self-hosted runner ASG&lt;/span&gt;
&lt;span class="s"&gt;resource "aws_launch_template" "runner" {&lt;/span&gt;
  &lt;span class="s"&gt;name_prefix   = "github-runner-"&lt;/span&gt;
  &lt;span class="s"&gt;image_id      = data.aws_ami.ubuntu.id&lt;/span&gt;
  &lt;span class="s"&gt;instance_type = "c6i.2xlarge"&lt;/span&gt;

  &lt;span class="s"&gt;user_data = base64encode(templatefile("runner-init.sh", {&lt;/span&gt;
    &lt;span class="s"&gt;github_org    = var.github_org&lt;/span&gt;
    &lt;span class="s"&gt;runner_token  = var.runner_registration_token&lt;/span&gt;
    &lt;span class="s"&gt;runner_labels = "self-hosted,linux,x64,large"&lt;/span&gt;
  &lt;span class="s"&gt;}))&lt;/span&gt;

  &lt;span class="s"&gt;block_device_mappings {&lt;/span&gt;
    &lt;span class="s"&gt;device_name = "/dev/sda1"&lt;/span&gt;
    &lt;span class="s"&gt;ebs {&lt;/span&gt;
      &lt;span class="s"&gt;volume_size = &lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt;
      &lt;span class="s"&gt;volume_type = "gp3"&lt;/span&gt;
    &lt;span class="s"&gt;}&lt;/span&gt;
  &lt;span class="s"&gt;}&lt;/span&gt;
&lt;span class="err"&gt;}&lt;/span&gt;

&lt;span class="s"&gt;resource "aws_autoscaling_group" "runners" {&lt;/span&gt;
  &lt;span class="s"&gt;desired_capacity = &lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;
  &lt;span class="s"&gt;max_size         = &lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;
  &lt;span class="s"&gt;min_size         = &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;

  &lt;span class="s"&gt;launch_template {&lt;/span&gt;
    &lt;span class="s"&gt;id      = aws_launch_template.runner.id&lt;/span&gt;
    &lt;span class="s"&gt;version = "$Latest"&lt;/span&gt;
  &lt;span class="s"&gt;}&lt;/span&gt;

  &lt;span class="s"&gt;tag {&lt;/span&gt;
    &lt;span class="s"&gt;key                 = "Purpose"&lt;/span&gt;
    &lt;span class="s"&gt;value               = "github-actions-runner"&lt;/span&gt;
    &lt;span class="s"&gt;propagate_at_launch = &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="s"&gt;}&lt;/span&gt;
&lt;span class="err"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Using Runner Labels for Job Routing
&lt;/h3&gt;

&lt;p&gt;Target specific runners using labels:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;unit-tests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Fast tests on GitHub-hosted runners&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm test&lt;/span&gt;

  &lt;span class="na"&gt;docker-build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Heavy builds on self-hosted with Docker cache&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;self-hosted&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;linux&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;large&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker build -t myapp:latest .&lt;/span&gt;

  &lt;span class="na"&gt;gpu-tests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# ML tests on GPU runners&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;self-hosted&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;gpu&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python -m pytest tests/ml/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Secrets Management Patterns
&lt;/h2&gt;

&lt;p&gt;Secrets handling in CI/CD is a top security concern. GitHub Actions provides several mechanisms, but you need to layer them correctly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Environment-Scoped Secrets
&lt;/h3&gt;

&lt;p&gt;Use GitHub environments to scope secrets to specific deployment targets:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;production&lt;/span&gt;  &lt;span class="c1"&gt;# Only secrets defined in "production" environment are available&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;DB_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.DB_PASSWORD }}&lt;/span&gt;  &lt;span class="c1"&gt;# Production-specific secret&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./deploy.sh&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Configure required reviewers on the production environment so deployments require manual approval.&lt;/p&gt;

&lt;h3&gt;
  
  
  OIDC Authentication (No Long-Lived Keys)
&lt;/h3&gt;

&lt;p&gt;Eliminate static AWS credentials entirely using OpenID Connect:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;id-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/configure-aws-credentials@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;role-to-assume&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;arn:aws:iam::123456789012:role/GitHubActionsRole&lt;/span&gt;
          &lt;span class="na"&gt;aws-region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eu-west-1&lt;/span&gt;
          &lt;span class="c1"&gt;# No access key or secret needed - uses OIDC token exchange&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The IAM role trust policy restricts which repositories and branches can assume it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Principal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Federated"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sts:AssumeRoleWithWebIdentity"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Condition"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"StringEquals"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"token.actions.githubusercontent.com:aud"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sts.amazonaws.com"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"StringLike"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"token.actions.githubusercontent.com:sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"repo:yourorg/yourrepo:ref:refs/heads/main"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Secrets Scanning and Rotation
&lt;/h3&gt;

&lt;p&gt;Add a workflow that scans for accidentally committed secrets:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;secrets-scan&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;fetch-depth&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;trufflesecurity/trufflehog@main&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;extra_args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;--only-verified&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Caching Strategies for Faster Builds
&lt;/h2&gt;

&lt;p&gt;Effective caching can cut build times by 50-80%.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dependency Caching
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/cache@v4&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;~/.npm&lt;/span&gt;
      &lt;span class="s"&gt;node_modules&lt;/span&gt;
    &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deps-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}&lt;/span&gt;
    &lt;span class="na"&gt;restore-keys&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;deps-${{ runner.os }}-&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Docker Layer Caching
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker/build-push-action@v5&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.&lt;/span&gt;
    &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;myapp:latest&lt;/span&gt;
    &lt;span class="na"&gt;cache-from&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;type=gha&lt;/span&gt;
    &lt;span class="na"&gt;cache-to&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;type=gha,mode=max&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Artifact Passing Between Jobs
&lt;/h3&gt;

&lt;p&gt;Use artifacts to pass build outputs between jobs without rebuilding:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm run build&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/upload-artifact@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build-output&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dist/&lt;/span&gt;
          &lt;span class="na"&gt;retention-days&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;

  &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/download-artifact@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build-output&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dist/&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./deploy.sh dist/&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Workflow Organization at Scale
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Path-Based Triggers
&lt;/h3&gt;

&lt;p&gt;Only run workflows when relevant files change:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;main&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;services/payment-api/**'&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.github/workflows/payment-api.yml'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Concurrency Control
&lt;/h3&gt;

&lt;p&gt;Prevent redundant workflow runs when commits are pushed rapidly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;concurrency&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deploy-${{ github.ref }}&lt;/span&gt;
  &lt;span class="na"&gt;cancel-in-progress&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This cancels the previous run when a new commit is pushed to the same branch, saving runner minutes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Composite Actions for Shared Steps
&lt;/h3&gt;

&lt;p&gt;When full reusable workflows are overkill, composite actions bundle common steps:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/actions/setup-project/action.yml&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Setup Project&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install dependencies and configure environment&lt;/span&gt;
&lt;span class="na"&gt;inputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
    &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;20'&lt;/span&gt;
&lt;span class="na"&gt;runs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;using&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;composite&lt;/span&gt;
  &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-node@v4&lt;/span&gt;
      &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ inputs.node-version }}&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/cache@v4&lt;/span&gt;
      &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;node_modules&lt;/span&gt;
        &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deps-${{ hashFiles('package-lock.json') }}&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm ci&lt;/span&gt;
      &lt;span class="na"&gt;shell&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bash&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use it in any workflow with a single line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./.github/actions/setup-project&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;20'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Monitoring and Debugging Workflows
&lt;/h2&gt;

&lt;p&gt;When workflows fail in production, you need fast diagnosis. GitHub provides several tools for this.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step-Level Timing Analysis
&lt;/h3&gt;

&lt;p&gt;Every workflow run shows timing per step. If your pipeline is slow, look for steps that take disproportionate time. Common culprits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Checkout with full history&lt;/strong&gt; (&lt;code&gt;fetch-depth: 0&lt;/code&gt;) on large repos - use shallow clones unless you need full history&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker builds without caching&lt;/strong&gt; - always use BuildKit cache with &lt;code&gt;cache-from&lt;/code&gt; and &lt;code&gt;cache-to&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NPM/pip installs without cache&lt;/strong&gt; - dependency caching alone can save 30-60 seconds per run&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Workflow Run Logs and Annotations
&lt;/h3&gt;

&lt;p&gt;Use &lt;code&gt;::warning&lt;/code&gt; and &lt;code&gt;::error&lt;/code&gt; workflow commands to surface issues directly in PR annotations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Check bundle size&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;SIZE=$(stat -f%z dist/bundle.js)&lt;/span&gt;
    &lt;span class="s"&gt;if [ "$SIZE" -gt 500000 ]; then&lt;/span&gt;
      &lt;span class="s"&gt;echo "::warning file=dist/bundle.js::Bundle size is ${SIZE} bytes (over 500KB limit)"&lt;/span&gt;
    &lt;span class="s"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Reusable Debug Workflow
&lt;/h3&gt;

&lt;p&gt;Create a debug composite action that dumps useful context when things go wrong:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Debug info&lt;/span&gt;
  &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;failure()&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;echo "## Environment" &amp;gt;&amp;gt; $GITHUB_STEP_SUMMARY&lt;/span&gt;
    &lt;span class="s"&gt;echo "- Runner: ${{ runner.os }} ${{ runner.arch }}" &amp;gt;&amp;gt; $GITHUB_STEP_SUMMARY&lt;/span&gt;
    &lt;span class="s"&gt;echo "- Node: $(node --version)" &amp;gt;&amp;gt; $GITHUB_STEP_SUMMARY&lt;/span&gt;
    &lt;span class="s"&gt;echo "- Disk: $(df -h / | tail -1)" &amp;gt;&amp;gt; $GITHUB_STEP_SUMMARY&lt;/span&gt;
    &lt;span class="s"&gt;echo "- Memory: $(free -h 2&amp;gt;/dev/null || vm_stat)" &amp;gt;&amp;gt; $GITHUB_STEP_SUMMARY&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Common Pitfalls to Avoid
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Not pinning action versions.&lt;/strong&gt; Using &lt;code&gt;actions/checkout@main&lt;/code&gt; means your workflow can break without any change on your side. Always pin to a specific version tag or commit SHA for third-party actions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Leaking secrets in logs.&lt;/strong&gt; GitHub automatically masks secrets in logs, but derived values (like a URL containing a token) are not masked. Use &lt;code&gt;::add-mask::&lt;/code&gt; for any sensitive derived values.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ignoring workflow permissions.&lt;/strong&gt; The default &lt;code&gt;GITHUB_TOKEN&lt;/code&gt; has broad permissions. Use the &lt;code&gt;permissions&lt;/code&gt; key to restrict to only what each job needs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;
  &lt;span class="na"&gt;pull-requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Running everything on push and pull_request.&lt;/strong&gt; This causes duplicate runs. Use &lt;code&gt;push&lt;/code&gt; for deploys (main branch only) and &lt;code&gt;pull_request&lt;/code&gt; for checks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Not using job summaries.&lt;/strong&gt; The &lt;code&gt;$GITHUB_STEP_SUMMARY&lt;/code&gt; file lets you write rich Markdown summaries that appear in the workflow run UI - much better than scrolling through raw logs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Need Help with Your DevOps?
&lt;/h2&gt;

&lt;p&gt;Setting up production-grade CI/CD pipelines with proper security, caching, and scaling takes real-world experience. At &lt;strong&gt;&lt;a href="https://instadevops.com" rel="noopener noreferrer"&gt;InstaDevOps&lt;/a&gt;&lt;/strong&gt;, we build and maintain CI/CD infrastructure for startups and growing teams so your developers can focus on shipping features.&lt;/p&gt;

&lt;p&gt;Plans start at &lt;strong&gt;$2,999/mo&lt;/strong&gt; for a dedicated fractional DevOps engineer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://calendly.com/instadevops/15min" rel="noopener noreferrer"&gt;Book a free 15-minute consultation&lt;/a&gt;&lt;/strong&gt; to optimize your GitHub Actions workflows.&lt;/p&gt;

</description>
      <category>github</category>
      <category>cicd</category>
      <category>devops</category>
      <category>automation</category>
    </item>
    <item>
      <title>Terraform Modules Done Right: Mono-Repo, Versioning, and Registry Patterns</title>
      <dc:creator>InstaDevOps</dc:creator>
      <pubDate>Mon, 13 Apr 2026 13:46:36 +0000</pubDate>
      <link>https://forem.com/instadevops/terraform-modules-done-right-mono-repo-versioning-and-registry-patterns-34g6</link>
      <guid>https://forem.com/instadevops/terraform-modules-done-right-mono-repo-versioning-and-registry-patterns-34g6</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;As your Terraform codebase grows beyond a handful of resources, you inevitably face a structural decision: how do you organize reusable infrastructure components so that multiple teams can collaborate without stepping on each other's toes?&lt;/p&gt;

&lt;p&gt;The answer is Terraform modules. But writing a module is easy - organizing, versioning, and distributing them at scale is where most teams struggle. Poorly structured modules lead to copy-paste drift, version conflicts, and infrastructure that nobody fully understands.&lt;/p&gt;

&lt;p&gt;In this guide, we will walk through battle-tested patterns for organizing Terraform modules, choosing between mono-repo and multi-repo strategies, leveraging module registries, and implementing versioning that actually works in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Makes a Good Terraform Module
&lt;/h2&gt;

&lt;p&gt;Before discussing organization, let us establish what a well-designed module looks like. A good Terraform module follows these principles:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Single responsibility.&lt;/strong&gt; Each module should manage one logical piece of infrastructure. A VPC module should not also create RDS instances.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sensible defaults with full override capability.&lt;/strong&gt; Provide defaults that work for 80% of use cases, but allow every significant parameter to be overridden:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"instance_type"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"EC2 instance type for the application servers"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;default&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"t3.medium"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"enable_enhanced_monitoring"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Enable enhanced monitoring with 60-second granularity"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;bool&lt;/span&gt;
  &lt;span class="nx"&gt;default&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"tags"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Additional tags to apply to all resources"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;default&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Clear input/output contracts.&lt;/strong&gt; Every variable should have a description, type constraint, and validation where appropriate. Every output that downstream consumers need should be explicitly exported:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"vpc_cidr"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"CIDR block for the VPC"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;

  &lt;span class="nx"&gt;validation&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;condition&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;can&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cidrhost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vpc_cidr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="nx"&gt;error_message&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Must be a valid CIDR block."&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;"vpc_id"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ID of the created VPC"&lt;/span&gt;
  &lt;span class="nx"&gt;value&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;"private_subnet_ids"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"List of private subnet IDs"&lt;/span&gt;
  &lt;span class="nx"&gt;value&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_subnet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;private&lt;/span&gt;&lt;span class="p"&gt;[*].&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Minimal provider assumptions.&lt;/strong&gt; Do not hardcode provider configurations inside modules. Let the caller configure the provider:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad - hardcoded provider in module&lt;/span&gt;
&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"aws"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;region&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"us-east-1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Good - module relies on caller's provider configuration&lt;/span&gt;
&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;required_providers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;aws&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/aws"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"&amp;gt;= 5.0"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Mono-Repo vs Multi-Repo: Choosing the Right Strategy
&lt;/h2&gt;

&lt;p&gt;This is the most debated structural decision in Terraform module management. Both approaches have legitimate trade-offs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mono-Repo Pattern
&lt;/h3&gt;

&lt;p&gt;All modules live in a single repository with a directory structure like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;terraform-modules/
├── modules/
│   ├── vpc/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   └── README.md
│   ├── ecs-service/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   └── README.md
│   ├── rds-postgres/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   └── README.md
│   └── s3-bucket/
│       ├── main.tf
│       ├── variables.tf
│       ├── outputs.tf
│       └── README.md
├── examples/
│   ├── complete-vpc/
│   └── ecs-with-rds/
├── tests/
│   ├── vpc_test.go
│   └── ecs_test.go
└── .github/
    └── workflows/
        └── test.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Advantages:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Atomic cross-module changes in a single PR&lt;/li&gt;
&lt;li&gt;Unified CI/CD pipeline for testing&lt;/li&gt;
&lt;li&gt;Easier to maintain consistency across modules&lt;/li&gt;
&lt;li&gt;Simpler onboarding for new team members&lt;/li&gt;
&lt;li&gt;One place to search for all infrastructure patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Disadvantages:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Git tags version the entire repo, not individual modules&lt;/li&gt;
&lt;li&gt;Large repos can slow down CI runs&lt;/li&gt;
&lt;li&gt;Permission boundaries are harder (everyone can see everything)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When to use:&lt;/strong&gt; Teams under 20 engineers, organizations with fewer than 30 modules, or when most modules are tightly coupled.&lt;/p&gt;

&lt;h3&gt;
  
  
  Multi-Repo Pattern
&lt;/h3&gt;

&lt;p&gt;Each module gets its own repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;terraform-module-vpc/          (v2.3.1)
terraform-module-ecs-service/  (v1.8.0)
terraform-module-rds-postgres/ (v3.1.0)
terraform-module-s3-bucket/    (v1.2.4)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Advantages:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Independent versioning per module using Git tags&lt;/li&gt;
&lt;li&gt;Fine-grained access control per repository&lt;/li&gt;
&lt;li&gt;Smaller, focused CI/CD pipelines&lt;/li&gt;
&lt;li&gt;Clear ownership boundaries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Disadvantages:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cross-module changes require multiple PRs&lt;/li&gt;
&lt;li&gt;Dependency management becomes complex&lt;/li&gt;
&lt;li&gt;More repositories to maintain&lt;/li&gt;
&lt;li&gt;Harder to ensure consistency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When to use:&lt;/strong&gt; Organizations with 50+ modules, multiple platform teams owning different modules, or strict compliance requirements needing audit trails per component.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Hybrid Approach
&lt;/h3&gt;

&lt;p&gt;Many mature organizations use a hybrid: a mono-repo for foundational modules maintained by the platform team, with separate repos for domain-specific modules owned by product teams:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;platform-terraform-modules/     # Platform team owns VPC, IAM, networking
├── modules/vpc/
├── modules/iam-roles/
└── modules/cloudfront/

team-payments-terraform/        # Payments team owns their service modules
├── modules/payment-service/
└── modules/fraud-detection/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Module Versioning Strategies
&lt;/h2&gt;

&lt;p&gt;Versioning is where most Terraform setups break down. Without proper versioning, a module change can silently break every environment that references it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Semantic Versioning
&lt;/h3&gt;

&lt;p&gt;Follow semver strictly for modules:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MAJOR&lt;/strong&gt; (v2.0.0): Breaking changes (removed variables, renamed resources that force replacement)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MINOR&lt;/strong&gt; (v1.3.0): New features, new optional variables with defaults&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PATCH&lt;/strong&gt; (v1.2.1): Bug fixes, documentation updates&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pinning Module Versions
&lt;/h3&gt;

&lt;p&gt;Always pin module versions in your root configurations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Good - pinned to exact version&lt;/span&gt;
&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"vpc"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"git::https://github.com/yourorg/terraform-modules.git//modules/vpc?ref=v2.3.1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Acceptable - pinned to minor version range&lt;/span&gt;
&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"vpc"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"app.terraform.io/yourorg/vpc/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 2.3"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Bad - no version pin, uses latest&lt;/span&gt;
&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"vpc"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"git::https://github.com/yourorg/terraform-modules.git//modules/vpc"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Automated Version Bumping
&lt;/h3&gt;

&lt;p&gt;Use a CI workflow that automatically creates releases when module directories change:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/release.yml&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Release Modules&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;main&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;detect-changes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;outputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;changed_modules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ steps.changes.outputs.modules }}&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;fetch-depth&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;changes&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;CHANGED=$(git diff --name-only HEAD~1 HEAD | grep '^modules/' | cut -d'/' -f2 | sort -u | jq -R -s -c 'split("\n")[:-1]')&lt;/span&gt;
          &lt;span class="s"&gt;echo "modules=$CHANGED" &amp;gt;&amp;gt; $GITHUB_OUTPUT&lt;/span&gt;

  &lt;span class="na"&gt;release&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;detect-changes&lt;/span&gt;
    &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;needs.detect-changes.outputs.changed_modules != '[]'&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;matrix&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;module&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ fromJson(needs.detect-changes.outputs.changed_modules) }}&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Determine version bump&lt;/span&gt;
        &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;version&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;CURRENT=$(git tag -l "modules/${{ matrix.module }}/v*" | sort -V | tail -1)&lt;/span&gt;
          &lt;span class="s"&gt;# Parse commit messages for bump type&lt;/span&gt;
          &lt;span class="s"&gt;if git log --oneline HEAD~1..HEAD | grep -q "BREAKING"; then&lt;/span&gt;
            &lt;span class="s"&gt;BUMP="major"&lt;/span&gt;
          &lt;span class="s"&gt;elif git log --oneline HEAD~1..HEAD | grep -q "feat"; then&lt;/span&gt;
            &lt;span class="s"&gt;BUMP="minor"&lt;/span&gt;
          &lt;span class="s"&gt;else&lt;/span&gt;
            &lt;span class="s"&gt;BUMP="patch"&lt;/span&gt;
          &lt;span class="s"&gt;fi&lt;/span&gt;
          &lt;span class="s"&gt;echo "bump=$BUMP" &amp;gt;&amp;gt; $GITHUB_OUTPUT&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Create release tag&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;# Bump version and create tag&lt;/span&gt;
          &lt;span class="s"&gt;git tag "modules/${{ matrix.module }}/v${NEW_VERSION}"&lt;/span&gt;
          &lt;span class="s"&gt;git push --tags&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Using a Private Module Registry
&lt;/h2&gt;

&lt;p&gt;A module registry provides a discoverable, versioned catalog of your organization's modules. You have several options.&lt;/p&gt;

&lt;h3&gt;
  
  
  Terraform Cloud / HCP Terraform Registry
&lt;/h3&gt;

&lt;p&gt;The simplest option if you are already using Terraform Cloud:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"vpc"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"app.terraform.io/yourorg/vpc/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"2.3.1"&lt;/span&gt;

  &lt;span class="nx"&gt;cidr_block&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"10.0.0.0/16"&lt;/span&gt;
  &lt;span class="nx"&gt;environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"production"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Publishing is automatic when you connect your VCS repository to the registry.&lt;/p&gt;

&lt;h3&gt;
  
  
  Self-Hosted with Artifactory or S3
&lt;/h3&gt;

&lt;p&gt;For air-gapped or highly regulated environments, you can host modules on S3:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"vpc"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"s3::https://my-terraform-modules.s3.amazonaws.com/vpc/v2.3.1.zip"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pair this with a CI pipeline that packages and uploads module archives on release:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nv"&gt;MODULE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;
&lt;span class="nv"&gt;VERSION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$2&lt;/span&gt;

&lt;span class="nb"&gt;cd &lt;/span&gt;modules/&lt;span class="nv"&gt;$MODULE&lt;/span&gt;
zip &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s2"&gt;"/tmp/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;MODULE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;-&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.zip"&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="s2"&gt;"/tmp/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;MODULE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;-&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.zip"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"s3://my-terraform-modules/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;MODULE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.zip"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  GitHub Releases as a Registry
&lt;/h3&gt;

&lt;p&gt;A lightweight approach using Git tags directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"vpc"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"git::https://github.com/yourorg/terraform-modules.git//modules/vpc?ref=v2.3.1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works well for smaller organizations and avoids the overhead of a dedicated registry.&lt;/p&gt;

&lt;h2&gt;
  
  
  Module Testing and Validation
&lt;/h2&gt;

&lt;p&gt;Modules without tests are modules you cannot trust. Here are the layers of testing you should implement.&lt;/p&gt;

&lt;h3&gt;
  
  
  Static Analysis
&lt;/h3&gt;

&lt;p&gt;Run these on every PR:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Format check&lt;/span&gt;
terraform &lt;span class="nb"&gt;fmt&lt;/span&gt; &lt;span class="nt"&gt;-check&lt;/span&gt; &lt;span class="nt"&gt;-recursive&lt;/span&gt; modules/

&lt;span class="c"&gt;# Validation&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="nb"&gt;dir &lt;/span&gt;&lt;span class="k"&gt;in &lt;/span&gt;modules/&lt;span class="k"&gt;*&lt;/span&gt;/&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$dir&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  terraform init &lt;span class="nt"&gt;-backend&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false
  &lt;/span&gt;terraform validate
  &lt;span class="nb"&gt;cd&lt;/span&gt; ../..
&lt;span class="k"&gt;done&lt;/span&gt;

&lt;span class="c"&gt;# Security scanning with tfsec&lt;/span&gt;
tfsec modules/

&lt;span class="c"&gt;# Linting with tflint&lt;/span&gt;
tflint &lt;span class="nt"&gt;--recursive&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Integration Testing with Terratest
&lt;/h3&gt;

&lt;p&gt;Write Go tests that actually provision and destroy infrastructure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;test&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"testing"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gruntwork-io/terratest/modules/terraform"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/stretchr/testify/assert"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;TestVpcModule&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Parallel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;terraformOptions&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;terraform&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithDefaultRetryableErrors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;terraform&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Options&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;TerraformDir&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"../modules/vpc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Vars&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{}{&lt;/span&gt;
            &lt;span class="s"&gt;"cidr_block"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="s"&gt;"10.99.0.0/16"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"environment"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"test"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;        &lt;span class="s"&gt;"terratest-vpc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;terraform&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Destroy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;terraformOptions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;terraform&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InitAndApply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;terraformOptions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;vpcId&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;terraform&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Output&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;terraformOptions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"vpc_id"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;assert&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NotEmpty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vpcId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;privateSubnets&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;terraform&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OutputList&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;terraformOptions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"private_subnet_ids"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;assert&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Equal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;privateSubnets&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Example Configurations
&lt;/h3&gt;

&lt;p&gt;Every module should ship with a working example in an &lt;code&gt;examples/&lt;/code&gt; directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;modules/vpc/
├── main.tf
├── variables.tf
├── outputs.tf
├── README.md
└── examples/
    ├── simple/
    │   └── main.tf       # Minimal usage
    └── complete/
        └── main.tf       # All features enabled
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Module Composition Patterns
&lt;/h2&gt;

&lt;p&gt;Real infrastructure is built by composing modules together. Here are patterns that work well at scale.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Root Module Pattern
&lt;/h3&gt;

&lt;p&gt;Create environment-specific root modules that compose shared modules:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# environments/production/main.tf&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"network"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"app.terraform.io/yourorg/vpc/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"2.3.1"&lt;/span&gt;

  &lt;span class="nx"&gt;cidr_block&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"10.0.0.0/16"&lt;/span&gt;
  &lt;span class="nx"&gt;availability_zones&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"eu-west-1a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"eu-west-1b"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"eu-west-1c"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;environment&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"production"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"database"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"app.terraform.io/yourorg/rds-postgres/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"3.1.0"&lt;/span&gt;

  &lt;span class="nx"&gt;vpc_id&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;network&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vpc_id&lt;/span&gt;
  &lt;span class="nx"&gt;subnet_ids&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;network&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;private_subnet_ids&lt;/span&gt;
  &lt;span class="nx"&gt;instance_class&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"db.r6g.xlarge"&lt;/span&gt;
  &lt;span class="nx"&gt;multi_az&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;environment&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"production"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"application"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"app.terraform.io/yourorg/ecs-service/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"1.8.0"&lt;/span&gt;

  &lt;span class="nx"&gt;vpc_id&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;network&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vpc_id&lt;/span&gt;
  &lt;span class="nx"&gt;subnet_ids&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;network&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;private_subnet_ids&lt;/span&gt;
  &lt;span class="nx"&gt;lb_target_group&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;network&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;alb_target_group_arn&lt;/span&gt;
  &lt;span class="nx"&gt;database_endpoint&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;
  &lt;span class="nx"&gt;desired_count&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;
  &lt;span class="nx"&gt;environment&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"production"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Terragrunt DRY Pattern
&lt;/h3&gt;

&lt;p&gt;For organizations managing many environments, Terragrunt eliminates repetition:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# terragrunt.hcl (root)&lt;/span&gt;
&lt;span class="nx"&gt;remote_state&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;backend&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"s3"&lt;/span&gt;
  &lt;span class="nx"&gt;generate&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;path&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"backend.tf"&lt;/span&gt;
    &lt;span class="nx"&gt;if_exists&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"overwrite_terragrunt"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;config&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;bucket&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"myorg-terraform-state"&lt;/span&gt;
    &lt;span class="nx"&gt;key&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${path_relative_to_include()}/terraform.tfstate"&lt;/span&gt;
    &lt;span class="nx"&gt;region&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"eu-west-1"&lt;/span&gt;
    &lt;span class="nx"&gt;encrypt&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="nx"&gt;dynamodb_table&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-locks"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# environments/production/vpc/terragrunt.hcl&lt;/span&gt;
&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"git::https://github.com/yourorg/terraform-modules.git//modules/vpc?ref=v2.3.1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;inputs&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;cidr_block&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"10.0.0.0/16"&lt;/span&gt;
  &lt;span class="nx"&gt;environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"production"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Common Pitfalls and How to Avoid Them
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Over-abstracting too early.&lt;/strong&gt; Do not create a module until you have written the same Terraform code at least twice. Premature abstraction creates rigid modules that fight against real requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Nested modules more than two levels deep.&lt;/strong&gt; Module A calls module B which calls module C is already hard to debug. Keep your module hierarchy shallow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Not using &lt;code&gt;moved&lt;/code&gt; blocks during refactors.&lt;/strong&gt; When restructuring modules, use &lt;code&gt;moved&lt;/code&gt; blocks to prevent Terraform from destroying and recreating resources:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;moved&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;from&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;web&lt;/span&gt;
  &lt;span class="nx"&gt;to&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;application&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;web&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Ignoring module documentation.&lt;/strong&gt; Every module should have a README generated by &lt;code&gt;terraform-docs&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Generate docs automatically&lt;/span&gt;
terraform-docs markdown table modules/vpc/ &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; modules/vpc/README.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Storing secrets in tfvars files.&lt;/strong&gt; Use a secrets manager and data sources instead of committing sensitive values.&lt;/p&gt;

&lt;h2&gt;
  
  
  Need Help with Your DevOps?
&lt;/h2&gt;

&lt;p&gt;Building and maintaining a well-structured Terraform module library takes experience. At &lt;strong&gt;&lt;a href="https://instadevops.com" rel="noopener noreferrer"&gt;InstaDevOps&lt;/a&gt;&lt;/strong&gt;, we help startups and growing teams implement production-grade Infrastructure as Code from day one - so you can ship infrastructure changes with the same confidence as application code.&lt;/p&gt;

&lt;p&gt;Plans start at &lt;strong&gt;$2,999/mo&lt;/strong&gt; for a dedicated fractional DevOps engineer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://calendly.com/instadevops/15min" rel="noopener noreferrer"&gt;Book a free 15-minute consultation&lt;/a&gt;&lt;/strong&gt; to discuss your Terraform architecture.&lt;/p&gt;

</description>
      <category>terraform</category>
      <category>iac</category>
      <category>devops</category>
      <category>modules</category>
    </item>
    <item>
      <title>Infrastructure Testing: Terratest, Checkov, and Validating Your IaC</title>
      <dc:creator>InstaDevOps</dc:creator>
      <pubDate>Sun, 12 Apr 2026 13:46:59 +0000</pubDate>
      <link>https://forem.com/instadevops/infrastructure-testing-terratest-checkov-and-validating-your-iac-2c49</link>
      <guid>https://forem.com/instadevops/infrastructure-testing-terratest-checkov-and-validating-your-iac-2c49</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Infrastructure as Code has transformed cloud management, but when a single Terraform apply can spin up hundreds of resources, testing becomes essential.&lt;/p&gt;

&lt;p&gt;This article explores Terratest for functional testing and Checkov for security scanning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Static Analysis with Checkov
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;checkov
checkov &lt;span class="nt"&gt;-d&lt;/span&gt; ./terraform
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Custom Policies
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;checkov.terraform.checks.resource.base_resource_check&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseResourceCheck&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;checkov.common.models.enums&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CheckCategories&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CheckResult&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EC2HasEnvironmentTag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseResourceCheck&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Ensure EC2 instances have environment tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CKV_CUSTOM_1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;supported_resources&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;aws_instance&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;supported_resources&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;supported_resources&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scan_resource_conf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conf&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tags&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[{}])[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Environment&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;CheckResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PASSED&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;CheckResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FAILED&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Functional Testing with Terratest
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;TestS3Bucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Parallel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;terraformOptions&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;terraform&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithDefaultRetryableErrors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;terraform&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Options&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;TerraformDir&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"../modules/s3-bucket"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Vars&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{}{&lt;/span&gt;
                &lt;span class="s"&gt;"bucket_name"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"test-bucket-"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UniqueId&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;terraform&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Destroy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;terraformOptions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;terraform&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InitAndApply&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;terraformOptions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;bucketID&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;terraform&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Output&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;terraformOptions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"bucket_id"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;actualStatus&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetS3BucketVersioning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"us-east-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bucketID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;assert&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Equal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Enabled"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;actualStatus&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  When to Use Each Tool
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Checkov&lt;/th&gt;
&lt;th&gt;Terratest&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Type&lt;/td&gt;
&lt;td&gt;Static analysis&lt;/td&gt;
&lt;td&gt;Functional testing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speed&lt;/td&gt;
&lt;td&gt;Seconds&lt;/td&gt;
&lt;td&gt;Minutes to hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;td&gt;Incurs cloud costs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Coverage&lt;/td&gt;
&lt;td&gt;Security, compliance&lt;/td&gt;
&lt;td&gt;Actual functionality&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  CI/CD Integration
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;static-analysis&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bridgecrewio/checkov-action@master&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;directory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;terraform/&lt;/span&gt;

  &lt;span class="na"&gt;terratest&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;static-analysis&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;cd test&lt;/span&gt;
          &lt;span class="s"&gt;go test -v -timeout 30m ./...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Layer your testing: Start with Checkov for fast feedback, add Terratest for critical modules, and run full integration tests before production.&lt;/p&gt;




&lt;h2&gt;
  
  
  Need Help with Your DevOps Infrastructure?
&lt;/h2&gt;

&lt;p&gt;At &lt;strong&gt;&lt;a href="https://instadevops.com" rel="noopener noreferrer"&gt;InstaDevOps&lt;/a&gt;&lt;/strong&gt;, we specialize in helping startups build production-ready infrastructure.&lt;/p&gt;

&lt;p&gt;📅 &lt;strong&gt;&lt;a href="https://calendly.com/instadevops/15min" rel="noopener noreferrer"&gt;Book a Free 15-Min Consultation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://instadevops.com/blog/infrastructure-testing" rel="noopener noreferrer"&gt;instadevops.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>terraform</category>
      <category>testing</category>
      <category>iac</category>
      <category>devops</category>
    </item>
    <item>
      <title>Kubernetes Networking Demystified: Services, Ingress, and Network Policies</title>
      <dc:creator>InstaDevOps</dc:creator>
      <pubDate>Sat, 11 Apr 2026 13:46:57 +0000</pubDate>
      <link>https://forem.com/instadevops/kubernetes-networking-demystified-services-ingress-and-network-policies-43ne</link>
      <guid>https://forem.com/instadevops/kubernetes-networking-demystified-services-ingress-and-network-policies-43ne</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Kubernetes networking is often cited as one of the most challenging aspects of container orchestration. This guide covers the three pillars: Services, Ingress, and Network Policies.&lt;/p&gt;

&lt;h2&gt;
  
  
  Services: Stable Endpoints for Dynamic Pods
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ClusterIP Services
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Service&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;backend-api&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ClusterIP&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;backend&lt;/span&gt;
  &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
    &lt;span class="na"&gt;targetPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  LoadBalancer Services
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Service&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;public-api&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LoadBalancer&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api&lt;/span&gt;
  &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;443&lt;/span&gt;
    &lt;span class="na"&gt;targetPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8443&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Ingress: HTTP/HTTPS Traffic Management
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;networking.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Ingress&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;app-ingress&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ingressClassName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx&lt;/span&gt;
  &lt;span class="na"&gt;tls&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;hosts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;app.example.com&lt;/span&gt;
    &lt;span class="na"&gt;secretName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;app-tls-secret&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;app.example.com&lt;/span&gt;
    &lt;span class="na"&gt;http&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/api&lt;/span&gt;
        &lt;span class="na"&gt;pathType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Prefix&lt;/span&gt;
        &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;backend-api&lt;/span&gt;
            &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;number&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Network Policies: Securing Pod Communication
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Default Deny All
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;networking.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;NetworkPolicy&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default-deny-all&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;podSelector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{}&lt;/span&gt;
  &lt;span class="na"&gt;policyTypes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Ingress&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;Egress&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Allow Specific Ingress
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;networking.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;NetworkPolicy&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;allow-frontend-to-backend&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;podSelector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;backend&lt;/span&gt;
  &lt;span class="na"&gt;ingress&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;podSelector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;frontend&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TCP&lt;/span&gt;
      &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8080&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Troubleshooting
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check endpoints&lt;/span&gt;
kubectl get endpoints my-service

&lt;span class="c"&gt;# Test DNS&lt;/span&gt;
kubectl &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; my-pod &lt;span class="nt"&gt;--&lt;/span&gt; nslookup kubernetes.default

&lt;span class="c"&gt;# Debug with netshoot&lt;/span&gt;
kubectl run netshoot &lt;span class="nt"&gt;--image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;nicolaka/netshoot &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Master Services, Ingress, and Network Policies, and you'll have a solid foundation for building secure, scalable applications on Kubernetes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Need Help with Your DevOps Infrastructure?
&lt;/h2&gt;

&lt;p&gt;At &lt;strong&gt;&lt;a href="https://instadevops.com" rel="noopener noreferrer"&gt;InstaDevOps&lt;/a&gt;&lt;/strong&gt;, we specialize in helping startups build production-ready infrastructure.&lt;/p&gt;

&lt;p&gt;📅 &lt;strong&gt;&lt;a href="https://calendly.com/instadevops/15min" rel="noopener noreferrer"&gt;Book a Free 15-Min Consultation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://instadevops.com/blog/kubernetes-networking" rel="noopener noreferrer"&gt;instadevops.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>networking</category>
      <category>devops</category>
      <category>security</category>
    </item>
    <item>
      <title>Container Registry Best Practices: ECR, Docker Hub, and Self-Hosted Options</title>
      <dc:creator>InstaDevOps</dc:creator>
      <pubDate>Fri, 10 Apr 2026 13:46:54 +0000</pubDate>
      <link>https://forem.com/instadevops/container-registry-best-practices-ecr-docker-hub-and-self-hosted-options-22kh</link>
      <guid>https://forem.com/instadevops/container-registry-best-practices-ecr-docker-hub-and-self-hosted-options-22kh</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Container registries are the backbone of containerized application deployment. Choosing the right registry and implementing proper practices can mean the difference between smooth deployments and security nightmares.&lt;/p&gt;

&lt;h2&gt;
  
  
  Amazon ECR: AWS-Native Registry
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a repository&lt;/span&gt;
aws ecr create-repository &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--repository-name&lt;/span&gt; my-app &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--image-scanning-configuration&lt;/span&gt; &lt;span class="nv"&gt;scanOnPush&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;

&lt;span class="c"&gt;# Push an image&lt;/span&gt;
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ECR Lifecycle Policies
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rules"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"rulePriority"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Keep last 10 production images"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"selection"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"tagStatus"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tagged"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"tagPrefixList"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"prod-"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"countType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"imageCountMoreThan"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"countNumber"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"expire"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Docker Hub
&lt;/h2&gt;

&lt;p&gt;Docker Hub remains the most widely used registry, hosting millions of public images.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rate Limits&lt;/strong&gt;: Anonymous pulls limited to 100 per 6 hours; authenticated free users get 200.&lt;/p&gt;

&lt;h2&gt;
  
  
  Self-Hosted: Harbor
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;helm install harbor harbor/harbor \&lt;/span&gt;
  &lt;span class="s"&gt;--set expose.type=ingress \&lt;/span&gt;
  &lt;span class="s"&gt;--set expose.ingress.hosts.core=registry.example.com \&lt;/span&gt;
  &lt;span class="s"&gt;--set trivy.enabled=true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Security Best Practices
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Enable image scanning&lt;/li&gt;
&lt;li&gt;Implement least-privilege access&lt;/li&gt;
&lt;li&gt;Sign your images with Cosign&lt;/li&gt;
&lt;li&gt;Use immutable tags&lt;/li&gt;
&lt;li&gt;Scan base images regularly&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Image Tagging Strategies
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;VERSION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"1.2.3"&lt;/span&gt;
&lt;span class="nv"&gt;GIT_SHA&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;git rev-parse &lt;span class="nt"&gt;--short&lt;/span&gt; HEAD&lt;span class="si"&gt;)&lt;/span&gt;

docker build &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-t&lt;/span&gt; my-app:&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-t&lt;/span&gt; my-app:&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;VERSION&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;-&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;GIT_SHA&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-t&lt;/span&gt; my-app:&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;GIT_SHA&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Whether you choose ECR for AWS integration, Docker Hub for ubiquity, or Harbor for control, applying security best practices will keep your container infrastructure secure.&lt;/p&gt;




&lt;h2&gt;
  
  
  Need Help with Your DevOps Infrastructure?
&lt;/h2&gt;

&lt;p&gt;At &lt;strong&gt;&lt;a href="https://instadevops.com" rel="noopener noreferrer"&gt;InstaDevOps&lt;/a&gt;&lt;/strong&gt;, we specialize in helping startups build production-ready infrastructure.&lt;/p&gt;

&lt;p&gt;📅 &lt;strong&gt;&lt;a href="https://calendly.com/instadevops/15min" rel="noopener noreferrer"&gt;Book a Free 15-Min Consultation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://instadevops.com/blog/container-registry-guide" rel="noopener noreferrer"&gt;instadevops.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>docker</category>
      <category>ecr</category>
      <category>containers</category>
      <category>devops</category>
    </item>
    <item>
      <title>Incident Management: Building Effective On-Call Rotations and Runbooks</title>
      <dc:creator>InstaDevOps</dc:creator>
      <pubDate>Thu, 09 Apr 2026 20:46:50 +0000</pubDate>
      <link>https://forem.com/instadevops/incident-management-building-effective-on-call-rotations-and-runbooks-10ho</link>
      <guid>https://forem.com/instadevops/incident-management-building-effective-on-call-rotations-and-runbooks-10ho</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Every engineering team dreads the 3 AM page. How your team handles these moments defines your reliability as a service provider.&lt;/p&gt;

&lt;p&gt;This guide covers on-call rotation design, runbook creation, incident response processes, and blameless post-mortems.&lt;/p&gt;

&lt;h2&gt;
  
  
  On-Call Best Practices
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Designing Sustainable Rotations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Rotation Length&lt;/strong&gt;: Weekly rotations work best&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Minimum Team Size&lt;/strong&gt;: Never fewer than 4 engineers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compensation&lt;/strong&gt;: Fair pay for on-call work&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Reducing Alert Fatigue
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Actionable alerts only&lt;/li&gt;
&lt;li&gt;Consolidate related alerts&lt;/li&gt;
&lt;li&gt;Regular alert reviews&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Runbook Template Example
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Runbook: Database Connection Pool Exhausted&lt;/span&gt;

&lt;span class="gu"&gt;## Impact Assessment&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; User Impact: New requests may fail
&lt;span class="p"&gt;-&lt;/span&gt; Affected Services: All services using primary database

&lt;span class="gu"&gt;## Immediate Actions&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; Check current connections:
   SELECT count(&lt;span class="err"&gt;*&lt;/span&gt;), state FROM pg_stat_activity GROUP BY state;
&lt;span class="p"&gt;
2.&lt;/span&gt; Identify connection hogs:
   SELECT usename, application_name, count(&lt;span class="err"&gt;*&lt;/span&gt;)
   FROM pg_stat_activity GROUP BY usename, application_name;

&lt;span class="gu"&gt;## Resolution Steps&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; If long-running query: pg_terminate_backend(&lt;span class="nt"&gt;&amp;lt;pid&amp;gt;&lt;/span&gt;)
&lt;span class="p"&gt;-&lt;/span&gt; If traffic spike: Scale up read replicas
&lt;span class="p"&gt;-&lt;/span&gt; If connection leak: Restart affected service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Incident Response Process
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Phase 1: Detection and Triage (0-5 minutes)
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Acknowledge the alert&lt;/li&gt;
&lt;li&gt;Assess severity&lt;/li&gt;
&lt;li&gt;Open relevant runbook&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Phase 2: Response Coordination
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Incident Commander&lt;/strong&gt;: Coordinates response&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Technical Lead&lt;/strong&gt;: Hands-on debugging&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Communications Lead&lt;/strong&gt;: Updates stakeholders&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Blameless Post-Mortems
&lt;/h2&gt;

&lt;p&gt;Focus on systems and processes, not individuals. The goal is learning and improvement.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Effective incident management is built on preparation, not heroics. Well-designed rotations, comprehensive runbooks, and blameless post-mortems transform incident response from stress into a competitive advantage.&lt;/p&gt;




&lt;h2&gt;
  
  
  Need Help with Your DevOps Infrastructure?
&lt;/h2&gt;

&lt;p&gt;At &lt;strong&gt;&lt;a href="https://instadevops.com" rel="noopener noreferrer"&gt;InstaDevOps&lt;/a&gt;&lt;/strong&gt;, we specialize in helping startups build production-ready infrastructure.&lt;/p&gt;

&lt;p&gt;📅 &lt;strong&gt;&lt;a href="https://calendly.com/instadevops/15min" rel="noopener noreferrer"&gt;Book a Free 15-Min Consultation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://instadevops.com/blog/incident-management" rel="noopener noreferrer"&gt;instadevops.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>incident</category>
      <category>oncall</category>
      <category>sre</category>
      <category>devops</category>
    </item>
    <item>
      <title>SRE Fundamentals: Defining SLOs, SLIs, and Error Budgets That Actually Work</title>
      <dc:creator>InstaDevOps</dc:creator>
      <pubDate>Thu, 09 Apr 2026 20:33:35 +0000</pubDate>
      <link>https://forem.com/instadevops/sre-fundamentals-defining-slos-slis-and-error-budgets-that-actually-work-42k7</link>
      <guid>https://forem.com/instadevops/sre-fundamentals-defining-slos-slis-and-error-budgets-that-actually-work-42k7</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Site Reliability Engineering (SRE) has transformed how organizations think about system reliability. Central to this framework are Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets.&lt;/p&gt;

&lt;p&gt;This guide will walk you through defining SLOs, SLIs, and Error Budgets that actually drive meaningful improvements.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the Reliability Hierarchy
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;SLAs (Service Level Agreements)&lt;/strong&gt;: External contracts with customers specifying consequences for failures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SLOs (Service Level Objectives)&lt;/strong&gt;: Internal targets your team commits to, stricter than SLAs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SLIs (Service Level Indicators)&lt;/strong&gt;: The actual measurements determining whether you're meeting SLOs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Error Budgets&lt;/strong&gt;: How much unreliability you can tolerate while meeting your SLO.&lt;/p&gt;

&lt;h2&gt;
  
  
  Defining Meaningful SLIs
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Four Golden Signals
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Latency&lt;/strong&gt;: How long requests take&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Traffic&lt;/strong&gt;: Request volume&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Errors&lt;/strong&gt;: Rate of failed requests&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Saturation&lt;/strong&gt;: How "full" your service is
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Availability SLI
sum(rate(http_requests_total{status=~"2.."}[5m]))
/
sum(rate(http_requests_total[5m]))

# Latency SLI
sum(rate(http_request_duration_seconds_bucket{le="0.2"}[5m]))
/
sum(rate(http_request_duration_seconds_count[5m]))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Setting Realistic SLOs
&lt;/h2&gt;

&lt;p&gt;Each additional "nine" dramatically reduces your error budget:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Availability&lt;/th&gt;
&lt;th&gt;Monthly Downtime&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;99%&lt;/td&gt;
&lt;td&gt;7.2 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;99.9%&lt;/td&gt;
&lt;td&gt;43.8 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;99.99%&lt;/td&gt;
&lt;td&gt;4.38 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Error Budgets: The Key to Balance
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Error Budget = 1 - SLO

For a 99.9% SLO over 30 days:
Error Budget = 0.1% = 43.2 minutes of downtime
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;SLOs, SLIs, and Error Budgets aren't just metrics—they're a framework for making better decisions about reliability. The goal is appropriate reliability—enough to keep users happy while maintaining velocity.&lt;/p&gt;




&lt;h2&gt;
  
  
  Need Help with Your DevOps Infrastructure?
&lt;/h2&gt;

&lt;p&gt;At &lt;strong&gt;&lt;a href="https://instadevops.com" rel="noopener noreferrer"&gt;InstaDevOps&lt;/a&gt;&lt;/strong&gt;, we specialize in helping startups build production-ready infrastructure.&lt;/p&gt;

&lt;p&gt;📅 &lt;strong&gt;&lt;a href="https://calendly.com/instadevops/15min" rel="noopener noreferrer"&gt;Book a Free 15-Min Consultation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://instadevops.com/blog/sre-slos-slis" rel="noopener noreferrer"&gt;instadevops.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>sre</category>
      <category>monitoring</category>
      <category>devops</category>
    </item>
    <item>
      <title>Disaster Recovery in the Cloud: RPO, RTO, and Building Resilient Systems</title>
      <dc:creator>InstaDevOps</dc:creator>
      <pubDate>Wed, 08 Apr 2026 14:15:32 +0000</pubDate>
      <link>https://forem.com/instadevops/disaster-recovery-in-the-cloud-rpo-rto-and-building-resilient-systems-3e5p</link>
      <guid>https://forem.com/instadevops/disaster-recovery-in-the-cloud-rpo-rto-and-building-resilient-systems-3e5p</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Every minute of downtime costs money. For some enterprises, that figure reaches $5,600 per minute. But beyond financial impact, outages erode customer trust and can pose compliance risks.&lt;/p&gt;

&lt;p&gt;This article explores disaster recovery fundamentals, specifically Recovery Point Objective (RPO) and Recovery Time Objective (RTO), and provides practical guidance for building resilient cloud systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding RPO and RTO
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Recovery Point Objective (RPO)
&lt;/h3&gt;

&lt;p&gt;How much data can we afford to lose? RPO represents the maximum acceptable data loss measured in time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Recovery Time Objective (RTO)
&lt;/h3&gt;

&lt;p&gt;How quickly must we restore operations? RTO defines maximum acceptable downtime.&lt;/p&gt;

&lt;h2&gt;
  
  
  DR Strategy Tiers
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Tier 1: Backup and Restore
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;RPO:&lt;/strong&gt; Hours to days | &lt;strong&gt;RTO:&lt;/strong&gt; Hours to days | &lt;strong&gt;Cost:&lt;/strong&gt; Lowest&lt;/p&gt;

&lt;h3&gt;
  
  
  Tier 2: Pilot Light
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;RPO:&lt;/strong&gt; Minutes to hours | &lt;strong&gt;RTO:&lt;/strong&gt; Hours | &lt;strong&gt;Cost:&lt;/strong&gt; Low to Medium&lt;/p&gt;

&lt;p&gt;Keep core components synchronized, but application servers remain off until needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tier 3: Warm Standby
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;RPO:&lt;/strong&gt; Minutes | &lt;strong&gt;RTO:&lt;/strong&gt; Minutes to hours | &lt;strong&gt;Cost:&lt;/strong&gt; Medium to High&lt;/p&gt;

&lt;p&gt;A scaled-down but fully functional environment runs continuously.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tier 4: Multi-Region Active-Active
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;RPO:&lt;/strong&gt; Near-zero | &lt;strong&gt;RTO:&lt;/strong&gt; Near-zero | &lt;strong&gt;Cost:&lt;/strong&gt; Highest&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_globalaccelerator_accelerator"&lt;/span&gt; &lt;span class="s2"&gt;"main"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"production-global"&lt;/span&gt;
  &lt;span class="nx"&gt;ip_address_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"IPV4"&lt;/span&gt;
  &lt;span class="nx"&gt;enabled&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Automating Failover
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_route53_health_check"&lt;/span&gt; &lt;span class="s2"&gt;"primary"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;fqdn&lt;/span&gt;              &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"api-primary.example.com"&lt;/span&gt;
  &lt;span class="nx"&gt;port&lt;/span&gt;              &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;443&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;              &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"HTTPS"&lt;/span&gt;
  &lt;span class="nx"&gt;failure_threshold&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"3"&lt;/span&gt;
  &lt;span class="nx"&gt;request_interval&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"10"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Testing Your DR Plan
&lt;/h2&gt;

&lt;p&gt;A DR plan that has never been tested is not a plan. Regular testing validates assumptions and trains your team.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Building resilient systems requires understanding business requirements, choosing appropriate DR strategies, and relentlessly testing. The cost of preparation is always less than the cost of recovery without a plan.&lt;/p&gt;




&lt;h2&gt;
  
  
  Need Help with Your DevOps Infrastructure?
&lt;/h2&gt;

&lt;p&gt;At &lt;strong&gt;&lt;a href="https://instadevops.com" rel="noopener noreferrer"&gt;InstaDevOps&lt;/a&gt;&lt;/strong&gt;, we specialize in helping startups build production-ready infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://calendly.com/instadevops/15min" rel="noopener noreferrer"&gt;Book a Free 15-Min Consultation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://instadevops.com/blog/disaster-recovery-cloud" rel="noopener noreferrer"&gt;instadevops.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>disasterrecovery</category>
      <category>cloud</category>
      <category>aws</category>
      <category>devops</category>
    </item>
    <item>
      <title>API Gateway Patterns: Kong, AWS API Gateway, and When to Use Each</title>
      <dc:creator>InstaDevOps</dc:creator>
      <pubDate>Sun, 05 Apr 2026 13:48:01 +0000</pubDate>
      <link>https://forem.com/instadevops/api-gateway-patterns-kong-aws-api-gateway-and-when-to-use-each-3e3m</link>
      <guid>https://forem.com/instadevops/api-gateway-patterns-kong-aws-api-gateway-and-when-to-use-each-3e3m</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;API gateways have become the backbone of modern microservices architecture. They act as the single entry point for all client requests, handling everything from routing and authentication to rate limiting and load balancing.&lt;/p&gt;

&lt;p&gt;In this article, we'll compare two popular API gateway solutions: &lt;strong&gt;Kong&lt;/strong&gt; (open-source and self-hosted) and &lt;strong&gt;AWS API Gateway&lt;/strong&gt; (fully managed).&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding API Gateway Patterns
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Edge Gateway Pattern&lt;/strong&gt;: A single entry point at the network edge that handles all external traffic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Backend for Frontend (BFF) Pattern&lt;/strong&gt;: Separate gateways optimized for different client types.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sidecar Pattern&lt;/strong&gt;: A gateway deployed alongside each service in service mesh architectures.&lt;/p&gt;

&lt;h2&gt;
  
  
  Kong: The Open-Source Powerhouse
&lt;/h2&gt;

&lt;p&gt;Kong is built on NGINX and offers both open-source and enterprise versions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# kong.yaml - Declarative configuration&lt;/span&gt;
&lt;span class="na"&gt;_format_version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3.0"&lt;/span&gt;

&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;user-service&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://user-api:3000&lt;/span&gt;
    &lt;span class="na"&gt;routes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;user-routes&lt;/span&gt;
        &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/api/users&lt;/span&gt;
    &lt;span class="na"&gt;plugins&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rate-limiting&lt;/span&gt;
        &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;minute&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;key-auth&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  AWS API Gateway
&lt;/h2&gt;

&lt;p&gt;AWS API Gateway is fully managed and integrates deeply with AWS services.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_api_gateway_rest_api"&lt;/span&gt; &lt;span class="s2"&gt;"main"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"production-api"&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Production API Gateway"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_api_gateway_method"&lt;/span&gt; &lt;span class="s2"&gt;"get_users"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;rest_api_id&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_api_gateway_rest_api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;resource_id&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_api_gateway_resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;http_method&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"GET"&lt;/span&gt;
  &lt;span class="nx"&gt;authorization&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"COGNITO_USER_POOLS"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Decision Framework
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Choose Kong When:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Multi-cloud or hybrid deployments&lt;/li&gt;
&lt;li&gt;Advanced customization needs&lt;/li&gt;
&lt;li&gt;Cost optimization at scale&lt;/li&gt;
&lt;li&gt;Avoiding vendor lock-in&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Choose AWS API Gateway When:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;AWS-native architecture&lt;/li&gt;
&lt;li&gt;Minimal operational overhead&lt;/li&gt;
&lt;li&gt;Rapid prototyping&lt;/li&gt;
&lt;li&gt;Deep AWS integration needed&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Both Kong and AWS API Gateway are excellent choices. Kong excels in flexibility and multi-cloud scenarios, while AWS API Gateway shines in AWS-native architectures.&lt;/p&gt;




&lt;h2&gt;
  
  
  Need Help with Your DevOps Infrastructure?
&lt;/h2&gt;

&lt;p&gt;At &lt;strong&gt;&lt;a href="https://instadevops.com" rel="noopener noreferrer"&gt;InstaDevOps&lt;/a&gt;&lt;/strong&gt;, we specialize in helping startups and scale-ups build production-ready infrastructure.&lt;/p&gt;

&lt;p&gt;📅 &lt;strong&gt;&lt;a href="https://calendly.com/instadevops/15min" rel="noopener noreferrer"&gt;Book a Free 15-Min Consultation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://instadevops.com/blog/api-gateway-patterns" rel="noopener noreferrer"&gt;instadevops.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>kong</category>
      <category>aws</category>
      <category>microservices</category>
    </item>
    <item>
      <title>Kubernetes Cost Optimization: FinOps Strategies for K8s Clusters</title>
      <dc:creator>InstaDevOps</dc:creator>
      <pubDate>Sat, 04 Apr 2026 13:47:59 +0000</pubDate>
      <link>https://forem.com/instadevops/kubernetes-cost-optimization-finops-strategies-for-k8s-clusters-1h46</link>
      <guid>https://forem.com/instadevops/kubernetes-cost-optimization-finops-strategies-for-k8s-clusters-1h46</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Kubernetes has become the de facto standard for container orchestration, but with great power comes great... cloud bills. According to recent studies, organizations waste an average of 30-35% of their Kubernetes spending on over-provisioned or idle resources. This isn't just a technical problem—it's a business problem that directly impacts your bottom line.&lt;/p&gt;

&lt;p&gt;FinOps (Financial Operations) brings financial accountability to the variable spend model of cloud computing. When applied to Kubernetes, FinOps practices help teams understand, manage, and optimize their cluster costs while maintaining the performance and reliability their applications need.&lt;/p&gt;

&lt;p&gt;In this guide, we'll explore practical strategies for optimizing Kubernetes costs, from right-sizing workloads to implementing automated scaling policies.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Kubernetes Cost Drivers
&lt;/h2&gt;

&lt;p&gt;Before optimizing costs, you need to understand where your money goes. Kubernetes costs typically break down into several categories:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Compute Resources&lt;/strong&gt;: The CPU and memory allocated to your nodes and pods represent the largest portion of most Kubernetes bills.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Storage&lt;/strong&gt;: Persistent volumes, storage classes, and backup solutions add up quickly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Networking&lt;/strong&gt;: Data transfer between availability zones, regions, and the internet can surprise you with unexpected charges.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Control Plane&lt;/strong&gt;: Managed Kubernetes services (EKS, GKE, AKS) charge for the control plane itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Right-Sizing Workloads with Resource Requests and Limits
&lt;/h2&gt;

&lt;p&gt;The foundation of Kubernetes cost optimization is properly configuring resource requests and limits.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api-server&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;myapp/api:v1.2.3&lt;/span&gt;
        &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;256Mi"&lt;/span&gt;
            &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;250m"&lt;/span&gt;
          &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;512Mi"&lt;/span&gt;
            &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;500m"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Implementing Vertical Pod Autoscaler (VPA)
&lt;/h2&gt;

&lt;p&gt;VPA automatically adjusts resource requests for pods based on actual usage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;autoscaling.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;VerticalPodAutoscaler&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api-server-vpa&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;targetRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
    &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api-server&lt;/span&gt;
  &lt;span class="na"&gt;updatePolicy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;updateMode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Auto"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Horizontal Pod Autoscaler for Demand-Based Scaling
&lt;/h2&gt;

&lt;p&gt;HPA adjusts the number of replicas based on demand:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;autoscaling/v2&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HorizontalPodAutoscaler&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api-server-hpa&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;scaleTargetRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
    &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api-server&lt;/span&gt;
  &lt;span class="na"&gt;minReplicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
  &lt;span class="na"&gt;maxReplicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20&lt;/span&gt;
  &lt;span class="na"&gt;metrics&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Resource&lt;/span&gt;
    &lt;span class="na"&gt;resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cpu&lt;/span&gt;
      &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Utilization&lt;/span&gt;
        &lt;span class="na"&gt;averageUtilization&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;70&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Leveraging Spot Instances
&lt;/h2&gt;

&lt;p&gt;Spot instances offer 60-90% discounts for suitable workloads:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;batch-processor&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;nodeSelector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;kubernetes.io/lifecycle&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spot&lt;/span&gt;
      &lt;span class="na"&gt;tolerations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kubernetes.io/lifecycle"&lt;/span&gt;
        &lt;span class="na"&gt;operator&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Equal"&lt;/span&gt;
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;spot"&lt;/span&gt;
        &lt;span class="na"&gt;effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NoSchedule"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Namespace Resource Quotas
&lt;/h2&gt;

&lt;p&gt;Prevent runaway costs with namespace-level guardrails:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ResourceQuota&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;team-alpha-quota&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;team-alpha&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;hard&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;requests.cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;20"&lt;/span&gt;
    &lt;span class="na"&gt;requests.memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;40Gi"&lt;/span&gt;
    &lt;span class="na"&gt;limits.cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;40"&lt;/span&gt;
    &lt;span class="na"&gt;limits.memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;80Gi"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Best Practices Checklist
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Set resource requests and limits on every container&lt;/li&gt;
&lt;li&gt;Use VPA in recommendation mode first&lt;/li&gt;
&lt;li&gt;Configure HPA with appropriate thresholds&lt;/li&gt;
&lt;li&gt;Enable Cluster Autoscaler with 50% scale-down threshold&lt;/li&gt;
&lt;li&gt;Use spot instances for fault-tolerant workloads&lt;/li&gt;
&lt;li&gt;Implement namespace quotas for accountability&lt;/li&gt;
&lt;li&gt;Deploy cost monitoring tools like Kubecost&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Kubernetes cost optimization isn't a one-time project—it's an ongoing practice. Start with visibility, implement technical controls, and treat cost as a first-class metric alongside availability and performance.&lt;/p&gt;




&lt;h2&gt;
  
  
  Need Help with Your DevOps Infrastructure?
&lt;/h2&gt;

&lt;p&gt;At &lt;strong&gt;&lt;a href="https://instadevops.com" rel="noopener noreferrer"&gt;InstaDevOps&lt;/a&gt;&lt;/strong&gt;, we specialize in helping startups and scale-ups build production-ready infrastructure without the overhead of a full-time DevOps team.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Our Services:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🏗️ &lt;strong&gt;AWS Consulting&lt;/strong&gt; - Cloud architecture, cost optimization, and migration&lt;/li&gt;
&lt;li&gt;☸️ &lt;strong&gt;Kubernetes Management&lt;/strong&gt; - Production-ready clusters and orchestration&lt;/li&gt;
&lt;li&gt;🚀 &lt;strong&gt;CI/CD Pipelines&lt;/strong&gt; - Automated deployment pipelines that just work&lt;/li&gt;
&lt;li&gt;📊 &lt;strong&gt;Monitoring &amp;amp; Observability&lt;/strong&gt; - See what's happening in your infrastructure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;📅 &lt;strong&gt;&lt;a href="https://calendly.com/instadevops/15min" rel="noopener noreferrer"&gt;Book a Free 15-Min Consultation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://instadevops.com/blog/kubernetes-cost-optimization" rel="noopener noreferrer"&gt;instadevops.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>finops</category>
      <category>cost</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
