<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Dhruv Chaudhary</title>
    <description>The latest articles on Forem by Dhruv Chaudhary (@dc-shimla).</description>
    <link>https://forem.com/dc-shimla</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3516414%2F76a9e3ff-d7bf-4319-891b-160641d8ff2b.jpg</url>
      <title>Forem: Dhruv Chaudhary</title>
      <link>https://forem.com/dc-shimla</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/dc-shimla"/>
    <language>en</language>
    <item>
      <title>Guardrails for AI-Generated IaC: How MyCoCo Made Speed Sustainable</title>
      <dc:creator>Dhruv Chaudhary</dc:creator>
      <pubDate>Sat, 06 Dec 2025 22:24:51 +0000</pubDate>
      <link>https://forem.com/dc-shimla/guardrails-for-ai-generated-iac-how-mycoco-made-speed-sustainable-2f1k</link>
      <guid>https://forem.com/dc-shimla/guardrails-for-ai-generated-iac-how-mycoco-made-speed-sustainable-2f1k</guid>
      <description>&lt;p&gt;AI coding assistants promise to accelerate infrastructure delivery, but organizations are discovering a hidden cost: code that passes syntax validation often fails security audits. &lt;a href="https://arxiv.org/html/2506.05623v1" rel="noopener noreferrer"&gt;Recent research&lt;/a&gt; shows that while AI-generated infrastructure code may look correct, &lt;strong&gt;only 9% meets security compliance standards&lt;/strong&gt;. When MyCoCo's platform team generated dozens of Terraform modules with AI assistance, a security scan revealed a sobering truth—speed without guardrails creates technical debt that compounds with every deployment.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Problem:&lt;/strong&gt; AI-generated Terraform passes &lt;code&gt;terraform validate&lt;/code&gt; but fails organizational compliance—missing tags, overly permissive IAM, exposed resources.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution:&lt;/strong&gt; Implement OPA-based policy guardrails at the PR level that catch AI blind spots before code reaches production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Impact:&lt;/strong&gt; MyCoCo reduced security findings from &lt;strong&gt;47 to 3&lt;/strong&gt; per AI-generated module while retaining &lt;strong&gt;70%&lt;/strong&gt; of velocity gains.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Implementation:&lt;/strong&gt; Custom OPA policies targeting common AI omissions: required tags, encryption enforcement, least-privilege IAM.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bottom Line:&lt;/strong&gt; AI accelerates IaC development, but only with organizational context injected through automated policy enforcement.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fky7zw1pzx8v67gkpy1lw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fky7zw1pzx8v67gkpy1lw.png" alt="OPA policy guardrails bridge the gap between what AI knows (public patterns) and what your organization requires (security, compliance, and governance standards)" width="800" height="417"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge: MyCoCo's AI Experiment
&lt;/h2&gt;

&lt;p&gt;Jordan, MyCoCo's Platform Engineer, was convinced AI would transform their infrastructure delivery. With a major product launch approaching, the platform team faced an impossible timeline: 30 new Terraform modules in six weeks. Using GitHub Copilot and Claude, Jordan's team produced the modules in just two weeks.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We were shipping infrastructure faster than ever. The AI understood Terraform syntax perfectly. Every module passed validation on the first try."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Then Maya, the Security Engineer, ran her pre-production Checkov scan.&lt;/p&gt;

&lt;p&gt;The results stopped the launch cold: &lt;strong&gt;47 security findings per module&lt;/strong&gt; on average. S3 buckets without encryption. Lambda functions with wildcard IAM permissions. And the most painful discovery—not a single resource had MyCoCo's required &lt;code&gt;Environment&lt;/code&gt;, &lt;code&gt;Owner&lt;/code&gt;, or &lt;code&gt;CostCenter&lt;/code&gt; tags.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The AI wrote syntactically perfect Terraform. But it had no idea about our tagging policies, our naming conventions, or our security baseline. It generated code like we were a greenfield startup, not a company preparing for SOC 2."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Sam, the Senior DevOps Engineer, had warned the team from the start. The confidence gap was real—the team trusted AI-generated code more than manually written code, despite having less visibility into its logic.&lt;/p&gt;

&lt;p&gt;Alex, VP of Engineering, faced a choice: delay the launch to manually fix every module, or find a way to make AI-generated code meet MyCoCo's standards automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: OPA Guardrails for AI-Generated Code
&lt;/h2&gt;

&lt;p&gt;MyCoCo's solution wasn't to abandon AI—it was to teach their pipeline what the AI didn't know. The team implemented a three-layer policy enforcement approach using Open Policy Agent (OPA) integrated with Conftest.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 1: Required Tags Policy
&lt;/h3&gt;

&lt;p&gt;The most common AI omission was resource tagging. MyCoCo created an OPA policy that fails any PR whose plan creates a resource without the required tags:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rego"&gt;&lt;code&gt;&lt;span class="c1"&gt;# policy/tags.rego&lt;/span&gt;
&lt;span class="ow"&gt;package&lt;/span&gt; &lt;span class="n"&gt;terraform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tags&lt;/span&gt;

&lt;span class="n"&gt;required_tags&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Environment"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"Owner"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"CostCenter"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;deny&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resource_changes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;actions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"create"&lt;/span&gt;

    &lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;after&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"tags"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;
    &lt;span class="n"&gt;missing&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;required_tags&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

    &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;missing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
    &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s2"&gt;"%s '%s' missing required tags: %v"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;missing&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
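
&lt;p&gt;One way to keep a rule like this honest is a Rego unit test that feeds it a small, hand-written plan. The sketch below is an assumption rather than part of MyCoCo's repository: the file name &lt;code&gt;policy/tags_test.rego&lt;/code&gt;, the test names, and the sample bucket are hypothetical. It uses the same pre-1.0 Rego syntax as the policy above; newer OPA and Conftest releases may expect the &lt;code&gt;if&lt;/code&gt; and &lt;code&gt;contains&lt;/code&gt; keywords or a compatibility option.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rego"&gt;&lt;code&gt;# policy/tags_test.rego (hypothetical file, same package as tags.rego)
package terraform.tags

# A bucket that carries only Owner should produce exactly one deny message.
test_partially_tagged_bucket_is_denied {
    plan := {"resource_changes": [{
        "type": "aws_s3_bucket",
        "name": "logs",
        "change": {"actions": ["create"], "after": {"tags": {"Owner": "platform"}}}
    }]}
    count(deny) == 1 with input as plan
}

# A bucket that carries all three required tags should pass.
test_fully_tagged_bucket_is_allowed {
    plan := {"resource_changes": [{
        "type": "aws_s3_bucket",
        "name": "logs",
        "change": {"actions": ["create"], "after": {"tags": {
            "Environment": "dev",
            "Owner": "platform",
            "CostCenter": "cc-1234"
        }}}
    }]}
    count(deny) == 0 with input as plan
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Tests like these can be run with &lt;code&gt;conftest verify --policy policy/&lt;/code&gt; or &lt;code&gt;opa test policy/&lt;/code&gt;. Note that the rule as written also fires for resource types that do not support tags at all, so a real policy would likely need an allowlist of taggable types.&lt;/p&gt;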



&lt;h3&gt;
  
  
  Layer 2: Encryption Enforcement
&lt;/h3&gt;

&lt;p&gt;AI-generated S3 buckets and RDS instances frequently lacked encryption configuration, a SOC 2 requirement. The policy below covers the S3 case:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rego"&gt;&lt;code&gt;&lt;span class="c1"&gt;# policy/encryption.rego&lt;/span&gt;
&lt;span class="ow"&gt;package&lt;/span&gt; &lt;span class="n"&gt;terraform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;encryption&lt;/span&gt;

&lt;span class="n"&gt;deny&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resource_changes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"aws_s3_bucket"&lt;/span&gt;
    &lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;actions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"create"&lt;/span&gt;

    &lt;span class="c1"&gt;# Check for server-side encryption configuration&lt;/span&gt;
    &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;has_encryption_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;address&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s2"&gt;"S3 bucket '%s' must have encryption enabled"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
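
&lt;p&gt;The rule above calls &lt;code&gt;has_encryption_config&lt;/code&gt;, which the snippet does not define, so OPA would reject the policy as written. MyCoCo's actual helper is not shown here; the following is a minimal sketch under two assumptions: AWS provider v4 or later, where bucket encryption lives in a separate &lt;code&gt;aws_s3_bucket_server_side_encryption_configuration&lt;/code&gt; resource, and resources declared in the root module only.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rego"&gt;&lt;code&gt;# policy/encryption.rego (continued) - hypothetical helper, not from the original post
# True when the plan's configuration declares an SSE resource whose
# `bucket` argument references the given bucket address
# (for example "aws_s3_bucket.logs").
has_encryption_config(address) {
    cfg := input.configuration.root_module.resources[_]
    cfg.type == "aws_s3_bucket_server_side_encryption_configuration"
    cfg.expressions.bucket.references[_] == address
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The helper reads the &lt;code&gt;configuration&lt;/code&gt; section of the plan JSON rather than &lt;code&gt;resource_changes&lt;/code&gt;, because the bucket ID is usually unknown at plan time while the reference between the two resources is always recorded. Buckets declared inside child modules are nested under &lt;code&gt;module_calls&lt;/code&gt; and would be reported as unencrypted by this sketch.&lt;/p&gt;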



&lt;h3&gt;
  
  
  Layer 3: IAM Least Privilege
&lt;/h3&gt;

&lt;p&gt;The most dangerous AI pattern was wildcard IAM permissions. This rule flags any IAM policy whose &lt;code&gt;Allow&lt;/code&gt; statements list &lt;code&gt;*&lt;/code&gt; as an action, before it reaches production:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rego"&gt;&lt;code&gt;&lt;span class="c1"&gt;# policy/iam.rego&lt;/span&gt;
&lt;span class="ow"&gt;package&lt;/span&gt; &lt;span class="n"&gt;terraform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iam&lt;/span&gt;

&lt;span class="n"&gt;deny&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resource_changes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_policy"&lt;/span&gt;

    &lt;span class="n"&gt;policy_doc&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unmarshal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;change&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;after&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;statement&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;policy_doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Statement&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;statement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Effect&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;
    &lt;span class="n"&gt;statement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Action&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"*"&lt;/span&gt;

    &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s2"&gt;"IAM policy '%s' contains wildcard Action - use least privilege"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
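
&lt;p&gt;IAM policy documents allow &lt;code&gt;Action&lt;/code&gt; to be either a list or a single string. The rule above iterates with &lt;code&gt;statement.Action[_]&lt;/code&gt;, which only matches the list form, so a statement written as &lt;code&gt;"Action": "*"&lt;/code&gt; would pass unnoticed. A companion rule for the string form might look like this sketch, which assumes the same package and the same pre-1.0 Rego syntax as the rule above:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rego"&gt;&lt;code&gt;# policy/iam.rego (continued) - sketch of a companion rule, not from the original post
# Catches statements written as "Action": "*" (a string rather than a list).
deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_iam_policy"

    policy_doc := json.unmarshal(resource.change.after.policy)
    statement := policy_doc.Statement[_]

    statement.Effect == "Allow"
    statement.Action == "*"

    msg := sprintf(
        "IAM policy '%s' sets Action to a bare wildcard - use least privilege",
        [resource.name]
    )
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Neither rule flags service-level wildcards such as &lt;code&gt;s3:*&lt;/code&gt;; whether those count as violations is a decision for each organization's policy.&lt;/p&gt;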



&lt;h3&gt;
  
  
  Pipeline Integration
&lt;/h3&gt;

&lt;p&gt;The team integrated these policies into their GitHub Actions workflow, running &lt;code&gt;conftest&lt;/code&gt; against every Terraform plan. Because the policies live in &lt;code&gt;terraform.*&lt;/code&gt; packages rather than Conftest's default &lt;code&gt;main&lt;/code&gt; namespace, the command passes &lt;code&gt;--all-namespaces&lt;/code&gt; so that every package is evaluated:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Policy Check&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;terraform plan -out=tfplan&lt;/span&gt;
    &lt;span class="s"&gt;terraform show -json tfplan &amp;gt; tfplan.json&lt;/span&gt;
    &lt;span class="s"&gt;conftest test tfplan.json --policy policy/ --all-namespaces&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Any policy violation blocks the PR merge, with clear error messages explaining exactly what needs to be fixed. Jordan found that AI assistants could often fix the violations when given the specific error message—turning the guardrail into a feedback loop.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results: MyCoCo's Transformation
&lt;/h2&gt;

&lt;p&gt;Within three weeks of implementing OPA guardrails, MyCoCo's metrics shifted dramatically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Security findings per AI-generated module:&lt;/strong&gt; 47 → 3 (94% reduction)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Development velocity:&lt;/strong&gt; Retained approximately 70% of the original speed gains&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unexpected benefit:&lt;/strong&gt; The guardrails improved manually written code too: engineers discovered that their own modules had tagging gaps&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;"We stopped thinking of AI as a code generator and started thinking of it as a fast first draft. The guardrails aren't a speed bump—they're the quality gate that makes the speed sustainable."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Maya added the policies to MyCoCo's security documentation, creating an "AI-Generated Code Checklist" that new team members review before using coding assistants. The launch proceeded on schedule, with infrastructure that passed SOC 2 audit on the first attempt.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Syntax validity does not equal security compliance.&lt;/strong&gt; AI-generated code that passes &lt;code&gt;terraform validate&lt;/code&gt; can still fail most security checks: in the research cited in the introduction, only 9% met security compliance standards.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AI lacks organizational context by design.&lt;/strong&gt; Your tagging policies, naming conventions, and security baselines don't exist in training data. Guardrails inject that context automatically.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The confidence gap is dangerous.&lt;/strong&gt; Teams often review AI-generated code less carefully than human-written code, despite it being more likely to have compliance gaps. Invert this assumption.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Guardrails create feedback loops.&lt;/strong&gt; When AI assistants receive specific policy violation messages, they can often self-correct—making the guardrail an accelerator, not just a gate.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start with the obvious omissions.&lt;/strong&gt; Required tags, encryption, and least-privilege IAM catch the majority of AI blind spots with minimal policy complexity.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;AI-generated infrastructure code isn't going away—it's too fast and too useful. But speed without guardrails creates security debt that compounds with every deployment. The solution isn't to abandon AI; it's to inject your organizational context through automated policy enforcement.&lt;/p&gt;

&lt;p&gt;Start with three policies: required tags, encryption enforcement, and IAM least privilege. These catch the majority of AI blind spots and give you a foundation to build on.&lt;/p&gt;

&lt;p&gt;Your infrastructure can be both fast and compliant. You just need the right guardrails.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's your experience with AI-generated infrastructure code? Have you implemented guardrails, or are you still reviewing everything manually? Share your lessons learned in the comments below!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>terraform</category>
      <category>security</category>
      <category>devops</category>
      <category>ai</category>
    </item>
    <item>
      <title>Terraform Stacks: MyCoCo's Landing Zone Dependencies Done Right</title>
      <dc:creator>Dhruv Chaudhary</dc:creator>
      <pubDate>Sat, 29 Nov 2025 22:23:35 +0000</pubDate>
      <link>https://forem.com/dc-shimla/terraform-stacks-mycocos-landing-zone-dependencies-done-right-2nna</link>
      <guid>https://forem.com/dc-shimla/terraform-stacks-mycocos-landing-zone-dependencies-done-right-2nna</guid>
      <description>&lt;h2&gt;
  
  
  Elevator Pitch
&lt;/h2&gt;

&lt;p&gt;Every growing platform team faces the same architectural challenge: shared infrastructure (networking, security, identity) must evolve independently from the applications that consume it, yet changes to these "landing zones" ripple unpredictably across downstream systems. MyCoCo's platform team discovered this the hard way when a routine networking update triggered a 47-minute production outage. Terraform Stacks, now generally available in HCP Terraform, turns landing zone management from a coordination nightmare into an automatically orchestrated dependency graph, so that foundational infrastructure changes are safe, visible, and propagated automatically to every consuming application.&lt;/p&gt;




&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Problem:&lt;/strong&gt; Landing zone changes—networking, security, IAM baselines—create invisible dependencies that break downstream applications when platform teams update shared infrastructure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Solution:&lt;/strong&gt; Terraform Stacks with Linked Stacks formalize landing zone relationships, automatically triggering downstream plans when foundational infrastructure changes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Impact:&lt;/strong&gt; MyCoCo eliminated surprise outages from landing zone updates and reduced cross-team coordination from hours of Slack messages to automatic HCP Terraform notifications&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key Implementation:&lt;/strong&gt; Landing zone Stack publishes outputs via &lt;code&gt;publish_output&lt;/code&gt; blocks; product Stacks consume them via &lt;code&gt;upstream_input&lt;/code&gt; blocks with automatic change propagation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bottom Line:&lt;/strong&gt; If your platform team manages shared infrastructure that application teams consume, Linked Stacks turn that implicit dependency into an explicit, automatically orchestrated relationship&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fruoymu4gerrvryaccuuz.png" alt="Terraform Stacks: Components for reusable modules, Deployments for environment parity, and Linked Stacks for automatic dependency orchestration" width="800" height="419"&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge: When Landing Zone Updates Break Everything
&lt;/h2&gt;

&lt;p&gt;MyCoCo's architecture followed the pattern every scaling platform team adopts: centralized landing zones managed by the platform team, consumed by product teams through Terraform data sources. Five product teams, three environments, two regions—on paper, clean separation of concerns. In practice? Invisible dependencies everywhere.&lt;/p&gt;

&lt;p&gt;The breaking point came on a Friday afternoon. The platform team pushed a routine subnet reorganization. GitHub Actions ran successfully. Tests passed. Three hours later, the Support team deployed an application update referencing security group IDs that had been reorganized. Production was down for 47 minutes.&lt;/p&gt;

&lt;p&gt;"The landing zone is supposed to be the stable foundation," Sam (Senior DevOps Engineer) said during the post-incident review. "Instead, it's a source of surprise outages because we have no way to know which product stacks depend on which outputs—until something breaks."&lt;/p&gt;

&lt;p&gt;Jordan (Platform Engineer) had been tracking Terraform Stacks since the beta. The GA release included Linked Stacks—exactly what MyCoCo needed: landing zone dependencies that are explicit, visible, and automatically coordinated.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Solution: Stacks for Landing Zone Architecture
&lt;/h2&gt;

&lt;p&gt;Terraform Stacks address the landing zone problem through three interconnected concepts: Components for reusable infrastructure definitions, Deployments for environment-specific instantiation, and Linked Stacks for formalizing cross-stack dependencies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; Stacks are exclusively an HCP Terraform feature; they're not available in open-source Terraform. With &lt;a href="https://developer.hashicorp.com/terraform/cloud-docs/overview/estimate-hcp-terraform-cost" rel="noopener noreferrer"&gt;RUM-based pricing&lt;/a&gt; (Resources Under Management), each managed resource counts toward your bill. For teams already paying the coordination tax in engineering hours, Slack threads, and incident response, the trade-off often favors Stacks. But model your costs explicitly before migrating, especially for large landing zones with many downstream consumers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Components: Reusable Landing Zone Modules
&lt;/h3&gt;

&lt;p&gt;Components wrap Terraform modules into reusable building blocks. For landing zones, this means defining networking, security, and identity components once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# landing-zone/components.tfcomponent.hcl&lt;/span&gt;
&lt;span class="nx"&gt;component&lt;/span&gt; &lt;span class="s2"&gt;"vpc"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"./modules/vpc"&lt;/span&gt;

  &lt;span class="nx"&gt;inputs&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;environment&lt;/span&gt;
    &lt;span class="nx"&gt;cidr_block&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cidr_block&lt;/span&gt;
    &lt;span class="nx"&gt;region&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;region&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;providers&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;aws&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;component&lt;/span&gt; &lt;span class="s2"&gt;"security_baseline"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"./modules/security"&lt;/span&gt;

  &lt;span class="nx"&gt;inputs&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;vpc_id&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;component&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vpc_id&lt;/span&gt;
    &lt;span class="nx"&gt;environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;environment&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;providers&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;aws&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Components resolve internal dependencies automatically: because &lt;code&gt;security_baseline&lt;/code&gt; takes &lt;code&gt;component.vpc.vpc_id&lt;/code&gt; as an input, it waits for &lt;code&gt;vpc&lt;/code&gt; without any orchestration scripts.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deployments: Environment Parity Without the Drift
&lt;/h3&gt;

&lt;p&gt;Maintaining landing zones across multiple environments traditionally forces an uncomfortable choice: copy-paste your Terraform code and modify it per environment (leading to inevitable drift), or manage complex variable files with conditional logic that becomes increasingly fragile.&lt;/p&gt;

&lt;p&gt;Deployments solve this by letting you define your infrastructure once via Components, then create multiple instantiations with only the inputs changing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# landing-zone/deployments.tfdeploy.hcl&lt;/span&gt;
&lt;span class="nx"&gt;deployment&lt;/span&gt; &lt;span class="s2"&gt;"development"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;inputs&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"dev"&lt;/span&gt;
    &lt;span class="nx"&gt;cidr_block&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"10.1.0.0/16"&lt;/span&gt;
    &lt;span class="nx"&gt;region&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ca-central-1"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;deployment&lt;/span&gt; &lt;span class="s2"&gt;"production"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;inputs&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"prod"&lt;/span&gt;
    &lt;span class="nx"&gt;cidr_block&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"10.0.0.0/16"&lt;/span&gt;
    &lt;span class="nx"&gt;region&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ca-central-1"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both deployments run the exact same VPC and security baseline components—just with different inputs. Each deployment maintains its own isolated state file, so changes to one environment never accidentally affect another.&lt;/p&gt;

&lt;p&gt;The practical payoff: when your platform team updates the VPC component logic, that change rolls out consistently to both dev and prod. No more "works in dev, breaks in prod" because the configurations drifted over months of separate maintenance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Linked Stacks: The Landing Zone Game-Changer
&lt;/h3&gt;

&lt;p&gt;The killer feature for platform teams is Linked Stacks. Instead of product teams using fragile data sources to reference landing zone outputs, the relationship becomes explicit and automatically coordinated.&lt;/p&gt;

&lt;p&gt;The landing zone Stack publishes specific outputs for downstream consumption, building on the production deployment we defined above:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# landing-zone/deployments.tfdeploy.hcl (continued)&lt;/span&gt;
&lt;span class="nx"&gt;publish_output&lt;/span&gt; &lt;span class="s2"&gt;"vpc_id"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Production VPC for application stacks"&lt;/span&gt;
  &lt;span class="nx"&gt;value&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;deployment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;production&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vpc_id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;publish_output&lt;/span&gt; &lt;span class="s2"&gt;"private_subnet_ids"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Private subnets for application workloads"&lt;/span&gt;
  &lt;span class="nx"&gt;value&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;deployment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;production&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;private_subnet_ids&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Product Stacks declare explicit dependencies on the landing zone using &lt;code&gt;upstream_input&lt;/code&gt; blocks:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# product-stack/deployments.tfdeploy.hcl&lt;/span&gt;
&lt;span class="nx"&gt;upstream_input&lt;/span&gt; &lt;span class="s2"&gt;"landing_zone"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"stack"&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"app.terraform.io/mycoco/platform/landing-zone"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;deployment&lt;/span&gt; &lt;span class="s2"&gt;"production"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;inputs&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"prod"&lt;/span&gt;
    &lt;span class="nx"&gt;vpc_id&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;upstream_input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;landing_zone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vpc_id&lt;/span&gt;
    &lt;span class="nx"&gt;subnet_ids&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;upstream_input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;landing_zone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;private_subnet_ids&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The critical benefit: when the platform team updates the landing zone—adding subnets, modifying security groups, changing routing—HCP Terraform automatically triggers plans in every downstream product Stack. No more "who needs to know about this change?" conversations. No more forgotten dependencies causing Friday incidents.&lt;/p&gt;
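
&lt;p&gt;On the consuming side, the values passed through &lt;code&gt;inputs&lt;/code&gt; arrive as ordinary Stack variables. The following is a minimal sketch of how a product Stack's component configuration might receive them. The file name, the &lt;code&gt;./modules/app&lt;/code&gt; module path, the component name, and the &lt;code&gt;list(string)&lt;/code&gt; type for the subnet IDs are all assumptions for illustration, and provider authentication is omitted.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;# product-stack/components.tfcomponent.hcl (hypothetical file, module, and component names)
required_providers {
  aws = {
    source  = "hashicorp/aws"
    version = "~&amp;gt; 6.0"
  }
}

variable "environment" {
  type = string
}

# Supplied by the deployment from upstream_input.landing_zone.vpc_id
variable "vpc_id" {
  type = string
}

# Supplied by the deployment from upstream_input.landing_zone.private_subnet_ids
variable "subnet_ids" {
  type = list(string) # Assumed type
}

provider "aws" "this" {
  config {
    region = "ca-central-1" # Authentication settings omitted in this sketch
  }
}

component "app" {
  source = "./modules/app"

  inputs = {
    environment = var.environment
    vpc_id      = var.vpc_id
    subnet_ids  = var.subnet_ids
  }

  providers = {
    aws = provider.aws.this
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the module only sees plain variables, it does not need to know whether a value was typed by hand or published by an upstream Stack.&lt;/p&gt;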




&lt;h2&gt;
  
  
  Results: Landing Zone Changes Without Fear
&lt;/h2&gt;

&lt;p&gt;Six weeks after migrating the landing zone to Stacks, the transformation was unmistakable.&lt;/p&gt;

&lt;p&gt;Landing zone updates went from anxiety-inducing Slack announcements to routine operations. When networking changes are needed, HCP Terraform automatically queues plans for all five product Stacks—product teams see exactly what upstream changes triggered their plan. The "did anyone change the landing zone today?" messages disappeared entirely.&lt;/p&gt;

&lt;p&gt;Environment drift hasn't occurred since the migration. The same components deploy identically across environments; when the platform team adds a subnet, it appears everywhere simultaneously.&lt;/p&gt;

&lt;p&gt;"Friday deployments used to terrify me," Sam admitted. "Now the dependency graph isn't in my head anymore—it's in the configuration."&lt;/p&gt;

&lt;p&gt;The team retired 400 lines of orchestration scripts and three GitHub Actions workflows. More importantly, they retired the on-call burden of being the human dependency resolver.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Start with your landing zone.&lt;/strong&gt; If your platform team manages shared infrastructure consumed by application teams, migrate the landing zone first. The &lt;code&gt;publish_output&lt;/code&gt; pattern immediately demonstrates value to downstream teams who see automatic plan triggers instead of surprise breakages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Linked Stacks replace tribal knowledge.&lt;/strong&gt; Every data source referencing another team's infrastructure is an implicit dependency waiting to cause an incident. &lt;code&gt;upstream_input&lt;/code&gt; blocks make those relationships explicit, versioned, and automatically coordinated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Automatic propagation changes team dynamics.&lt;/strong&gt; When landing zone changes automatically trigger downstream plans, the platform team stops being a coordination bottleneck and starts being an infrastructure service provider.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model your costs before migrating.&lt;/strong&gt; Stacks are &lt;a href="https://developer.hashicorp.com/terraform/cloud-docs/stacks" rel="noopener noreferrer"&gt;HCP Terraform-only&lt;/a&gt;. Calculate your RUM-based costs against the engineering hours you're currently spending on coordination—for most teams with complex landing zones, the math favors Stacks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Migrate incrementally.&lt;/strong&gt; Stacks coexist with traditional workspaces. Migrate your landing zone, connect one product Stack as a proof of concept, then expand. The &lt;code&gt;upstream_input&lt;/code&gt; pattern works across the boundary—new Stacks can consume from existing ones.&lt;/p&gt;

&lt;p&gt;For platform teams drowning in cross-repository dependencies and change coordination, Linked Stacks represent the most significant improvement to landing zone management since Terraform workspaces. The dependency graph you've been maintaining manually? It's finally infrastructure as code.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Ready to formalize your landing zone dependencies? Start by modeling your current architecture as Stacks components, then connect one downstream consumer with &lt;code&gt;upstream_input&lt;/code&gt; blocks.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>terraform</category>
      <category>stacks</category>
      <category>networking</category>
      <category>devops</category>
    </item>
    <item>
      <title>Why Ephemeral Resources in Terraform Matter: How MyCoCo Eliminated Secrets from State Files</title>
      <dc:creator>Dhruv Chaudhary</dc:creator>
      <pubDate>Fri, 19 Sep 2025 16:48:25 +0000</pubDate>
      <link>https://forem.com/dc-shimla/why-ephemeral-resources-in-terraform-matter-how-mycoco-eliminated-secrets-from-state-files-4b0k</link>
      <guid>https://forem.com/dc-shimla/why-ephemeral-resources-in-terraform-matter-how-mycoco-eliminated-secrets-from-state-files-4b0k</guid>
      <description>&lt;p&gt;When organizations manage cloud infrastructure with code, sensitive passwords and access keys often get saved in files where anyone with access can see them—creating major security risks. Terraform v1.10's "ephemeral" features solve this by using secrets temporarily without saving them permanently. Here's how MyCoCo went from failing security audits to achieving SOC 2 compliance by eliminating all exposed passwords from their infrastructure files.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Problem:&lt;/strong&gt; Terraform state files store database passwords, API keys, and certificates in plaintext, creating major security and compliance risks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution:&lt;/strong&gt; Terraform v1.10's ephemeral resources exist only during execution—never written to state or plan files. They use a unique lifecycle: open → use → close.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Impact:&lt;/strong&gt; MyCoCo went from failing security audits due to exposed secrets to achieving SOC 2 compliance by eliminating all sensitive data from state files. Zero secrets in state, improved developer experience, simplified disaster recovery.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Implementation:&lt;/strong&gt; Use the ephemeral &lt;code&gt;aws_secretsmanager_secret_version&lt;/code&gt; resource to fetch secrets, and write-only arguments such as &lt;code&gt;secret_string_wo&lt;/code&gt; (Terraform v1.11 or later) to store them without state persistence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bottom Line:&lt;/strong&gt; If you're storing secrets in Terraform state files, ephemeral resources aren't optional—they're essential for enterprise security.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge: MyCoCo's Security Wake-Up Call
&lt;/h2&gt;

&lt;p&gt;MyCoCo started like many tech companies—two DevOps engineers managing infrastructure for a growing startup. As they scaled from 5 to 50 engineers, their Terraform usage expanded rapidly. They were proud of their infrastructure-as-code approach, with everything properly versioned and state files safely stored in an encrypted S3 bucket.&lt;/p&gt;

&lt;p&gt;Then came the compliance audit.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Your Terraform state files contain database passwords in plaintext," the security consultant announced during their SOC 2 preparation. "Anyone with access to your state backend can see every database credential, API token, and private key you manage."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The MyCoCo team was stunned. They'd focused on encrypting state files at rest and restricting S3 bucket access, but they'd never considered what was inside those encrypted files. A quick investigation revealed the scope:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Database passwords for PostgreSQL and Redis instances&lt;/li&gt;
&lt;li&gt;JWT signing keys for authentication&lt;/li&gt;
&lt;li&gt;SSL certificate private keys&lt;/li&gt;
&lt;li&gt;Third-party API tokens for monitoring services&lt;/li&gt;
&lt;li&gt;SSH keys for bastion host access&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even worse, their disaster recovery procedures included copying state files to a secondary region, and their CI/CD system downloaded state files to runners for each deployment. Every copy contained the same sensitive data.&lt;/p&gt;

&lt;p&gt;Their initial attempts to solve this failed. The &lt;code&gt;sensitive&lt;/code&gt; argument only hides CLI output—data still goes to state files. External secret management tools proved complex to integrate with existing workflows.&lt;/p&gt;
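
&lt;p&gt;To make that limitation concrete, here is a minimal sketch of the pattern that fails an audit. The variable name, resource name, and instance settings are illustrative assumptions, not MyCoCo's actual configuration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;# Illustrative only: names and sizes are assumptions
variable "db_password" {
  type      = string
  sensitive = true # Redacts the value in plan and apply output only
}

resource "aws_db_instance" "example" {
  identifier          = "example-db"
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  username            = "app_admin"
  password            = var.db_password # Still recorded in the state file as plaintext
  skip_final_snapshot = true
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Marking a value as sensitive changes how Terraform displays it, not where Terraform stores it.&lt;/p&gt;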

&lt;p&gt;The audit deadline was approaching, and MyCoCo needed a solution that would eliminate secrets from state files without a complete workflow overhaul.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: MyCoCo's Ephemeral Implementation
&lt;/h2&gt;

&lt;p&gt;Terraform v1.10's ephemeral resources solved MyCoCo's problem. These resources exist only during execution—they're opened when needed, used during operations, then closed. Values never touch state or plan files.&lt;/p&gt;

&lt;h3&gt;
  
  
  Before: Passwords in State
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# OLD APPROACH - Password stored in state&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"random_password"&lt;/span&gt; &lt;span class="s2"&gt;"db_password"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;length&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
  &lt;span class="nx"&gt;special&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_db_instance"&lt;/span&gt; &lt;span class="s2"&gt;"main"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;password&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;random_password&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_password&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;  &lt;span class="c1"&gt;# Goes to state!&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  After: Clean State Files
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# NEW APPROACH - No passwords in state&lt;/span&gt;
&lt;span class="nx"&gt;ephemeral&lt;/span&gt; &lt;span class="s2"&gt;"random_password"&lt;/span&gt; &lt;span class="s2"&gt;"db_password"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;length&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
  &lt;span class="nx"&gt;override_special&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"!#$%&amp;amp;*()-_=+[]{}&amp;lt;&amp;gt;:?"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_secretsmanager_secret_version"&lt;/span&gt; &lt;span class="s2"&gt;"db_password"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;secret_id&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_secretsmanager_secret&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_password&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;secret_string_wo&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ephemeral&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;random_password&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_password&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



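&lt;p&gt;The key insight: &lt;code&gt;secret_string_wo&lt;/code&gt; is a write-only argument. It accepts ephemeral values and passes them to the provider during apply, but Terraform never records them in state or plan files. Write-only arguments require Terraform v1.11 or later. Because Terraform cannot diff a value it does not store, the AWS provider pairs the argument with &lt;code&gt;secret_string_wo_version&lt;/code&gt;; increment that number when you want a new value written.&lt;/p&gt;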
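
&lt;p&gt;A fuller, self-contained variant of the snippet above might look like the following sketch. It assumes Terraform v1.11 or later and a recent &lt;code&gt;hashicorp/aws&lt;/code&gt; provider that supports the &lt;code&gt;_wo&lt;/code&gt; arguments; the secret name, database identifier, and instance settings are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;# Sketch assuming Terraform v1.11+ and a recent hashicorp/aws provider
resource "aws_secretsmanager_secret" "db_password" {
  name = "mycoco/db-password" # Hypothetical secret name
}

ephemeral "random_password" "db_password" {
  length           = 16
  override_special = "!#$%&amp;amp;*()-_=+[]{}&amp;lt;&amp;gt;:?"
}

resource "aws_secretsmanager_secret_version" "db_password" {
  secret_id                = aws_secretsmanager_secret.db_password.id
  secret_string_wo         = ephemeral.random_password.db_password.result
  secret_string_wo_version = 1 # Increment to write a newly generated password
}

# Read the stored value back at run time without persisting it
ephemeral "aws_secretsmanager_secret_version" "db_password" {
  secret_id = aws_secretsmanager_secret_version.db_password.secret_id
}

resource "aws_db_instance" "main" {
  identifier          = "mycoco-main" # Hypothetical identifier
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  username            = "app_admin"
  skip_final_snapshot = true

  # Write-only password: sent to AWS during apply, never stored in state
  password_wo         = ephemeral.aws_secretsmanager_secret_version.db_password.secret_string
  password_wo_version = aws_secretsmanager_secret_version.db_password.secret_string_wo_version
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Tying &lt;code&gt;password_wo_version&lt;/code&gt; to the secret's version number means a single increment rotates both the stored secret and the database password.&lt;/p&gt;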

&lt;h3&gt;
  
  
  Retrieving Secrets Without Persistence
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;ephemeral&lt;/span&gt; &lt;span class="s2"&gt;"aws_secretsmanager_secret_version"&lt;/span&gt; &lt;span class="s2"&gt;"app_database"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;secret_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_secretsmanager_secret_version&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_password&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;secret_id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;locals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;db_credentials&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jsondecode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;ephemeral&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_secretsmanager_secret_version&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;app_database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;secret_string&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"postgresql"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;host&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_db_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;address&lt;/span&gt;
  &lt;span class="nx"&gt;username&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_credentials&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"username"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;password&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_credentials&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"password"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



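&lt;p&gt;This pattern retrieves secrets during Terraform execution without writing them to state or plan files. Note that &lt;code&gt;jsondecode&lt;/code&gt; assumes the secret holds a JSON document with &lt;code&gt;username&lt;/code&gt; and &lt;code&gt;password&lt;/code&gt; keys; a secret that stores only a bare password string would be passed to the provider directly instead.&lt;/p&gt;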

&lt;h2&gt;
  
  
  Results: MyCoCo's Security Transformation
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Complete Secret Elimination:&lt;/strong&gt; State files went from containing dozens of sensitive values to zero. Security audits could focus on access controls rather than data exposure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Simplified Compliance:&lt;/strong&gt; SOC 2 compliance became straightforward when they could demonstrate no secrets persisted in infrastructure artifacts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Improved Operations:&lt;/strong&gt; Disaster recovery, state file backups, and vendor sharing became routine operations, since copies of state no longer carried credentials.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Enhanced Developer Experience:&lt;/strong&gt; Team members could examine state files for debugging without exposing production secrets.&lt;/p&gt;

&lt;p&gt;The transition required understanding that ephemeral resources can't be referenced in contexts requiring persistence—but this constraint is the security feature, not a limitation.&lt;/p&gt;
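
&lt;p&gt;The constraint can be sketched with an ephemeral input variable. The names below are hypothetical, and the example assumes Terraform v1.11 or later so that the write-only argument is available.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;# Hypothetical names; assumes Terraform v1.11+ for write-only arguments
variable "api_token" {
  type      = string
  ephemeral = true # Supplied at run time, never written to state or plan
}

resource "aws_secretsmanager_secret" "api_token" {
  name = "mycoco/monitoring-api-token"
}

resource "aws_secretsmanager_secret_version" "api_token" {
  secret_id = aws_secretsmanager_secret.api_token.id

  # Valid: a write-only argument accepts an ephemeral value
  secret_string_wo         = var.api_token
  secret_string_wo_version = 1

  # Invalid: a regular argument persists its value, so Terraform rejects it
  # secret_string = var.api_token
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If the commented line were enabled, Terraform would report an error at plan time rather than quietly writing the token to state.&lt;/p&gt;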

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start with High-Value Secrets&lt;/strong&gt;: Prioritize database passwords, API keys, and certificates for maximum security impact.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Understand Write-Only Arguments&lt;/strong&gt;: These accept ephemeral values without storing them in state—the key to clean implementation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Plan for Provider Differences&lt;/strong&gt;: AWS, Azure, and GCP implement ephemeral resources differently based on their native secret management patterns.&lt;/p&gt;&lt;/li&gt;
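&lt;li&gt;&lt;p&gt;&lt;strong&gt;Validate Implementation&lt;/strong&gt;: Inspect the raw state with &lt;code&gt;terraform state pull&lt;/code&gt; or &lt;code&gt;terraform show -json&lt;/code&gt; to verify sensitive values no longer appear. Plain &lt;code&gt;terraform show&lt;/code&gt; redacts sensitive values, so it can hide a leak.&lt;/p&gt;&lt;/li&gt;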
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Embrace the Constraint&lt;/strong&gt;: Ephemeral resources can't persist data—this limitation is the security benefit.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;For teams still storing secrets in state files, ephemeral resources represent a fundamental shift toward security-by-design in infrastructure-as-code. The investment in learning this approach pays dividends in compliance, security, and operational confidence.&lt;/p&gt;

&lt;p&gt;The key is understanding that ephemeral resources solve a fundamental problem: how to use sensitive data in infrastructure code without creating persistent security risks. They're not just a feature—they're a security paradigm that should be standard practice for any organization handling sensitive infrastructure.&lt;/p&gt;

&lt;p&gt;Ready to eliminate secrets from your state files? Start by identifying your highest-risk secrets and implementing ephemeral resources with Terraform v1.10 or later (v1.11 or later if you also need write-only arguments).&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's your experience with secrets in Terraform state files? Have you implemented ephemeral resources or found other approaches to this security challenge? Share your strategies and lessons learned in the comments below!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>terraform</category>
      <category>security</category>
      <category>infrastructureascode</category>
      <category>secretsmanagement</category>
    </item>
    <item>
      <title>Comprehensive Terraform State Security: MyCoCo's Journey from Public Exposure to Layered Protection</title>
      <dc:creator>Dhruv Chaudhary</dc:creator>
      <pubDate>Fri, 19 Sep 2025 16:44:40 +0000</pubDate>
      <link>https://forem.com/dc-shimla/comprehensive-terraform-state-security-mycocos-journey-from-public-exposure-to-layered-protection-7kg</link>
      <guid>https://forem.com/dc-shimla/comprehensive-terraform-state-security-mycocos-journey-from-public-exposure-to-layered-protection-7kg</guid>
      <description>&lt;p&gt;When organizations store Terraform state files, they're essentially creating blueprints of their entire infrastructure that hackers would love to access. Even encrypted storage isn't enough if the data inside reveals architectural vulnerabilities, database locations, and system dependencies. Here's how MyCoCo transformed a three-hour public exposure incident into enterprise-grade state security that passed rigorous compliance audits.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Problem:&lt;/strong&gt; Terraform state files expose infrastructure architecture, resource relationships, and system dependencies even without credential leaks—creating reconnaissance goldmines for attackers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution:&lt;/strong&gt; Layered state security combining remote backends, encryption at rest/transit, access controls, and audit logging across multiple protection levels.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Impact:&lt;/strong&gt; MyCoCo went from accidental public exposure to enterprise-grade state security, achieving SOC 2 compliance and eliminating architectural intelligence gathering risks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Implementation:&lt;/strong&gt; S3 backend with KMS encryption, IAM policies with least privilege, and comprehensive audit trails.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bottom Line:&lt;/strong&gt; If your state files aren't comprehensively protected, you're giving attackers detailed blueprints of your infrastructure—encryption alone isn't enough.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge: MyCoCo's Security Wake-Up Call
&lt;/h2&gt;

&lt;p&gt;During MyCoCo's SOC 2 preparation, Maya's security audit revealed multiple state file vulnerabilities beyond the secrets exposure issue. The immediate crisis came during a routine infrastructure update when Sam accidentally applied a Terraform configuration that modified their S3 bucket policy, making their state storage publicly readable for three hours before the monitoring system caught the misconfiguration.&lt;/p&gt;

&lt;p&gt;Maya was immediately alerted to the bucket exposure. While investigating the incident, she discovered something that kept her awake that night: even without any credential leaks, the exposed state files revealed a detailed map of MyCoCo's entire infrastructure architecture.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Look at this," Maya told the team during the post-incident review, pulling up the state file contents. "Someone could see our database instance types, VPC configurations, load balancer setups, and even our disaster recovery patterns. They'd know exactly how our systems connect and where our critical components live."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Alex, their VP of Engineering, realized the scope:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"This isn't just about encryption. Someone could use this architectural information to plan attacks, understand our scaling patterns, and identify potential weak points in our infrastructure."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The incident triggered an emergency security review that encompassed both state file access controls and the secrets exposure issues Maya had identified. While no credentials were compromised, the architectural exposure violated several SOC 2 requirements and could have enabled sophisticated attacks. MyCoCo needed comprehensive state security that protected both sensitive data and infrastructure intelligence.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: MyCoCo's Layered State Security Implementation
&lt;/h2&gt;

&lt;p&gt;Maya designed a comprehensive state security strategy addressing multiple threat vectors beyond basic encryption. The approach required protecting state files from unauthorized access while maintaining team productivity and compliance requirements.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 1: Secure Remote Backend Modernization
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Enhanced S3 backend configuration with native state locking&lt;/span&gt;
&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;required_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"&amp;gt;= 1.12"&lt;/span&gt;  &lt;span class="c1"&gt;# Latest stable version&lt;/span&gt;
  &lt;span class="nx"&gt;required_providers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;aws&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/aws"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 6.0"&lt;/span&gt;  &lt;span class="c1"&gt;# Latest major version&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;backend&lt;/span&gt; &lt;span class="s2"&gt;"s3"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;bucket&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"mycoco-terraform-state-prod"&lt;/span&gt;
    &lt;span class="nx"&gt;key&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"infrastructure/terraform.tfstate"&lt;/span&gt;
    &lt;span class="nx"&gt;region&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ca-central-1"&lt;/span&gt;
    &lt;span class="nx"&gt;encrypt&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="nx"&gt;kms_key_id&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"arn:aws:kms:ca-central-1:123456789:key/terraform-state"&lt;/span&gt;
    &lt;span class="nx"&gt;use_lockfile&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

    &lt;span class="c1"&gt;# Keep the backend's validation checks enabled (false is the default)&lt;/span&gt;
    &lt;span class="nx"&gt;skip_region_validation&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
    &lt;span class="nx"&gt;skip_credentials_validation&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
    &lt;span class="nx"&gt;skip_metadata_api_check&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MyCoCo's existing S3 remote state setup had critical security gaps and used outdated DynamoDB locking. Layer 1 modernized their backend configuration with S3 native locking using &lt;code&gt;use_lockfile = true&lt;/code&gt;, eliminating their DynamoDB dependency while adding proper security configurations. This foundational update secured their existing remote state infrastructure and established the baseline for all subsequent security controls.&lt;/p&gt;
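
&lt;p&gt;The backend block only tells Terraform where state lives; the bucket itself also needs hardening. The following is a minimal sketch, assuming the bucket and key are managed in a separate bootstrap configuration. The resource names mirror those referenced in this article, but the article does not state whether MyCoCo used exactly these settings.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;# Sketch only: settings are assumptions, not MyCoCo's confirmed configuration
resource "aws_kms_key" "terraform_state" {
  description         = "Encrypts Terraform state objects"
  enable_key_rotation = true
}

resource "aws_s3_bucket" "terraform_state" {
  bucket = "mycoco-terraform-state-prod"
}

# Reject public ACLs and public bucket policies
resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket                  = aws_s3_bucket.terraform_state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Keep prior state versions for recovery
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Default every new object to SSE-KMS with the state key
resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.terraform_state.arn
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With &lt;code&gt;block_public_policy&lt;/code&gt; enabled, S3 rejects any bucket policy that grants public access, which is the kind of change behind the three-hour exposure.&lt;/p&gt;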

&lt;h3&gt;
  
  
  Layer 2: Access Control and Authentication
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# IAM policy for Terraform state access with S3 native locking support&lt;/span&gt;
&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_policy_document"&lt;/span&gt; &lt;span class="s2"&gt;"terraform_state_access"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;# Reads and deletes carry no encryption headers, so no SSE conditions apply here&lt;/span&gt;
  &lt;span class="nx"&gt;statement&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;effect&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;
    &lt;span class="nx"&gt;actions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="s2"&gt;"s3:GetObject"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s2"&gt;"s3:DeleteObject"&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="s2"&gt;"${aws_s3_bucket.terraform_state.arn}/*"&lt;/span&gt;  &lt;span class="c1"&gt;# Covers state objects and .tflock lock files&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;# Writes must be encrypted with the state KMS key&lt;/span&gt;
  &lt;span class="nx"&gt;statement&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;effect&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;
    &lt;span class="nx"&gt;actions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"s3:PutObject"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"${aws_s3_bucket.terraform_state.arn}/*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="nx"&gt;condition&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;test&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"StringEquals"&lt;/span&gt;
      &lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"s3:x-amz-server-side-encryption"&lt;/span&gt;
      &lt;span class="nx"&gt;values&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"aws:kms"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nx"&gt;condition&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;test&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"StringEquals"&lt;/span&gt;
      &lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"s3:x-amz-server-side-encryption-aws-kms-key-id"&lt;/span&gt;
      &lt;span class="nx"&gt;values&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"${aws_kms_key.terraform_state.arn}"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;statement&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;effect&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;
    &lt;span class="nx"&gt;actions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"s3:ListBucket"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;aws_s3_bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;terraform_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;statement&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;effect&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;
    &lt;span class="nx"&gt;actions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="s2"&gt;"kms:Decrypt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s2"&gt;"kms:GenerateDataKey"&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;aws_kms_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;terraform_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Layer 2 established granular control over who could access state files and under what conditions. The IAM policy names the exact S3 actions permitted on state and lock objects, requires KMS encryption with the designated key, and limits KMS usage to decrypting and generating data keys for that key alone.&lt;/p&gt;
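
&lt;p&gt;One caveat is worth checking before reusing the policy as written. &lt;code&gt;GetObject&lt;/code&gt; and &lt;code&gt;DeleteObject&lt;/code&gt; requests do not carry the server-side-encryption headers, and a &lt;code&gt;StringEquals&lt;/code&gt; condition on a key that is absent from the request does not match. A minimal sketch of one way to handle this is to scope the encryption conditions to writes only. The data source name &lt;code&gt;state_access_split&lt;/code&gt; is hypothetical; the bucket and key references follow the example above.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;# Hypothetical data source name; adapt to the policy document you already have
data "aws_iam_policy_document" "state_access_split" {
  # Reads and deletes send no encryption headers, so no condition is attached
  statement {
    effect    = "Allow"
    actions   = ["s3:GetObject", "s3:DeleteObject"]
    resources = ["${aws_s3_bucket.terraform_state.arn}/*"]
  }

  # Writes must request SSE-KMS with the designated state key
  statement {
    effect    = "Allow"
    actions   = ["s3:PutObject"]
    resources = ["${aws_s3_bucket.terraform_state.arn}/*"]

    condition {
      test     = "StringEquals"
      variable = "s3:x-amz-server-side-encryption"
      values   = ["aws:kms"]
    }

    condition {
      test     = "StringEquals"
      variable = "s3:x-amz-server-side-encryption-aws-kms-key-id"
      values   = [aws_kms_key.terraform_state.arn]
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This sketch assumes the backend is configured to send the KMS headers on every write. If it is not, the conditioned &lt;code&gt;PutObject&lt;/code&gt; statement will reject state uploads.&lt;/p&gt;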

&lt;h3&gt;
  
  
  Layer 3: Monitoring and Audit Trail
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Configure AWS provider&lt;/span&gt;
&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"aws"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;region&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ca-central-1"&lt;/span&gt;  &lt;span class="c1"&gt;# Primary region for Toronto-based MyCoCo&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# CloudTrail for state access monitoring&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_cloudtrail"&lt;/span&gt; &lt;span class="s2"&gt;"terraform_state_audit"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-state-access"&lt;/span&gt;
  &lt;span class="nx"&gt;s3_bucket_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_s3_bucket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;audit_logs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;bucket&lt;/span&gt;
  &lt;span class="nx"&gt;s3_key_prefix&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-state-audit"&lt;/span&gt;

  &lt;span class="nx"&gt;event_selector&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;read_write_type&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"All"&lt;/span&gt;
    &lt;span class="nx"&gt;include_management_events&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

    &lt;span class="nx"&gt;data_resource&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;type&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"AWS::S3::Object"&lt;/span&gt;
      &lt;span class="nx"&gt;values&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"${aws_s3_bucket.terraform_state.arn}/*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Real-time monitoring&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_cloudwatch_metric_alarm"&lt;/span&gt; &lt;span class="s2"&gt;"state_access_errors"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;alarm_name&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-state-access-errors"&lt;/span&gt;
  &lt;span class="nx"&gt;comparison_operator&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"GreaterThanThreshold"&lt;/span&gt;
  &lt;span class="nx"&gt;evaluation_periods&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="nx"&gt;metric_name&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ErrorCount"&lt;/span&gt;
  &lt;span class="nx"&gt;namespace&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"AWS/CloudTrail"&lt;/span&gt;
  &lt;span class="nx"&gt;period&lt;/span&gt;              &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;
  &lt;span class="nx"&gt;statistic&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Sum"&lt;/span&gt;
  &lt;span class="nx"&gt;threshold&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="nx"&gt;alarm_description&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Alert on S3 access errors for Terraform state files"&lt;/span&gt;
  &lt;span class="nx"&gt;alarm_actions&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;aws_sns_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;state_security_alerts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Layer 3 provided visibility into every read and write against the state bucket, enabling MyCoCo to detect security incidents, demonstrate compliance, and respond to suspicious activity. The code references infrastructure defined elsewhere in MyCoCo's broader security architecture (S3 buckets, KMS keys, SNS topics).&lt;/p&gt;
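
&lt;p&gt;If the alarm's metric turns out not to be populated in your account, a common alternative is to deliver the trail to CloudWatch Logs and derive a custom metric with a metric filter. The following is a sketch under stated assumptions, not MyCoCo's actual configuration: the log group name, the metric name, and the &lt;code&gt;MyCoCo/TerraformState&lt;/code&gt; namespace are all hypothetical, and the trail would also need its &lt;code&gt;cloud_watch_logs_group_arn&lt;/code&gt; and &lt;code&gt;cloud_watch_logs_role_arn&lt;/code&gt; arguments set so that events reach the log group.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;# Hypothetical log group that the trail delivers events to
resource "aws_cloudwatch_log_group" "state_audit" {
  name              = "/mycoco/terraform-state-audit"
  retention_in_days = 365
}

# Emit one data point for every AccessDenied event that CloudTrail records
resource "aws_cloudwatch_log_metric_filter" "state_access_denied" {
  name           = "terraform-state-access-denied"
  log_group_name = aws_cloudwatch_log_group.state_audit.name
  pattern        = "{ $.errorCode = \"AccessDenied\" }"

  metric_transformation {
    name      = "StateAccessDenied"
    namespace = "MyCoCo/TerraformState"
    value     = "1"
  }
}

# Alarm on the custom metric; quiet periods are treated as healthy
resource "aws_cloudwatch_metric_alarm" "state_access_denied" {
  alarm_name          = "terraform-state-access-denied"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "StateAccessDenied"
  namespace           = "MyCoCo/TerraformState"
  period              = 300
  statistic           = "Sum"
  threshold           = 0
  treat_missing_data  = "notBreaching"
  alarm_actions       = [aws_sns_topic.state_security_alerts.arn]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;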

&lt;h2&gt;
  
  
  Results: MyCoCo's Security Transformation
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Complete Architectural Protection:&lt;/strong&gt; State files became inaccessible to unauthorized users, eliminating reconnaissance risks and architectural intelligence gathering. Security audits could focus on access controls rather than exposure prevention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Enterprise Compliance Achievement:&lt;/strong&gt; SOC 2 Type II certification became straightforward when auditors could verify comprehensive state security controls, encryption standards, and access audit trails.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Operational Excellence:&lt;/strong&gt; The security improvements enhanced rather than hindered operations. Automated monitoring provided early warning of issues, granular IAM policies provided appropriate access controls, and comprehensive audit trails simplified incident response and compliance reporting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Zero Security Incidents:&lt;/strong&gt; The comprehensive approach eliminated state-related security concerns. The team could focus on infrastructure innovation rather than security firefighting, and executive stakeholders gained confidence in MyCoCo's infrastructure governance.&lt;/p&gt;

&lt;p&gt;The transition required understanding that state security extends beyond encryption to include access control, monitoring, and compliance integration—but these comprehensive protections became competitive advantages rather than operational overhead.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start with Remote Backends&lt;/strong&gt;: Local state files create uncontrollable security risks that grow with team size and infrastructure complexity.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Layer Security Controls&lt;/strong&gt;: Encryption alone isn't sufficient—combine with access controls, monitoring, and audit trails for comprehensive protection.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Consider Multi-Region Deployments&lt;/strong&gt;: For enhanced disaster recovery, consider deploying monitoring infrastructure across multiple regions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Plan for Compliance&lt;/strong&gt;: Enterprise customers increasingly require demonstrable state security controls as part of vendor security assessments.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Monitor Continuously&lt;/strong&gt;: Automated monitoring and alerting enable rapid response to misconfigurations and unauthorized access attempts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Implement Comprehensive Logging&lt;/strong&gt;: Detailed audit trails and access monitoring provide the foundation for both security operations and compliance requirements.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;For teams still using local state or basic remote backends, comprehensive state security represents a fundamental shift toward enterprise-ready infrastructure governance. The investment in proper state security pays dividends in compliance, security confidence, and operational reliability.&lt;/p&gt;

&lt;p&gt;The key insight is that Terraform state files are more than just operational data—they're detailed architectural blueprints that must be protected with the same rigor as your production databases. Start with encrypted remote backends, then build layered access controls and monitoring around your infrastructure state.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's your approach to Terraform state security? Have you dealt with state file exposure incidents or compliance requirements? Share your security strategies and lessons learned in the comments below!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>terraform</category>
      <category>security</category>
      <category>aws</category>
      <category>compliance</category>
    </item>
    <item>
      <title>Terraform Modules and Workspaces: How MyCoCo Scaled from Copy-Paste to DRY Infrastructure</title>
      <dc:creator>Dhruv Chaudhary</dc:creator>
      <pubDate>Fri, 19 Sep 2025 16:42:24 +0000</pubDate>
      <link>https://forem.com/dc-shimla/terraform-modules-and-workspaces-how-mycoco-scaled-from-copy-paste-to-dry-infrastructure-15cn</link>
      <guid>https://forem.com/dc-shimla/terraform-modules-and-workspaces-how-mycoco-scaled-from-copy-paste-to-dry-infrastructure-15cn</guid>
      <description>&lt;p&gt;When teams manage multiple environments in Terraform, they often duplicate entire configurations, creating a maintenance nightmare. Terraform workspaces combined with modules provide an elegant solution—one codebase serves all environments through workspace-aware configurations. Here's how MyCoCo transformed their infrastructure management using workspaces to achieve true DRY (Don't Repeat Yourself) principles.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Problem:&lt;/strong&gt; Copy-pasted Terraform configurations across dev, staging, and production led to environment drift, deployment failures, and 20+ hours of weekly maintenance overhead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution:&lt;/strong&gt; Terraform workspaces for environment separation combined with modules for code reuse and workspace-specific variable files.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Impact:&lt;/strong&gt; MyCoCo reduced configuration management time by 70%, eliminated environment drift, and deployed new environments in hours instead of days.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Implementation:&lt;/strong&gt; Use &lt;code&gt;terraform.workspace&lt;/code&gt; with modular design patterns and one structured &lt;code&gt;.tfvars&lt;/code&gt; file per workspace for environment-specific configuration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bottom Line:&lt;/strong&gt; If you're maintaining separate Terraform directories per environment, workspaces + modules will transform your infrastructure management.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge: MyCoCo's Environment Sprawl
&lt;/h2&gt;

&lt;p&gt;"Why do we have three different versions of our RDS configuration?" Jordan asked during his first week as MyCoCo's platform engineer. The answer revealed a deeper problem.&lt;/p&gt;

&lt;p&gt;Sam, the senior DevOps engineer, explained their evolution:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We started with one environment, then copied everything for staging, then copied again for production. Now we maintain three separate Terraform directories."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The structure told the story:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;infrastructure/
├── dev/
│   ├── main.tf      # 500 lines
│   ├── rds.tf       # 200 lines
│   └── variables.tf # 150 lines
├── staging/
│   ├── main.tf      # 497 lines (slightly different)
│   ├── rds.tf       # 198 lines (minor variations)
│   └── variables.tf # 148 lines (different values)
└── production/
    ├── main.tf      # 502 lines (more differences)
    ├── rds.tf       # 205 lines (critical variations)
    └── variables.tf # 152 lines (production values)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alex, VP of Engineering, quantified the impact:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Every infrastructure change requires three pull requests, three code reviews, and three separate terraform apply commands. We're burning 20 hours per week just keeping these in sync."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The breaking point came when a critical security patch needed to be applied across all environments. What should have been a 30-minute task turned into a two-day project of carefully updating each environment, testing separately, and hoping nothing was missed.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: Workspaces + Modules Architecture
&lt;/h2&gt;

&lt;p&gt;MyCoCo's transformation centered on two key Terraform features: workspaces for environment isolation and modules for code reuse.&lt;/p&gt;
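
&lt;p&gt;Workspace isolation depends on each workspace keeping its own state. As a rough sketch, assuming an S3 backend and a hypothetical bucket name, the &lt;code&gt;workspace_key_prefix&lt;/code&gt; setting controls where each non-default workspace's state object is stored:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;# backend.tf - hypothetical file; bucket name and key are placeholders
terraform {
  backend "s3" {
    bucket               = "mycoco-terraform-state"
    key                  = "infrastructure/terraform.tfstate"
    region               = "ca-central-1"
    workspace_key_prefix = "workspaces"
    encrypt              = true
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With this layout, the &lt;code&gt;staging&lt;/code&gt; workspace would write its state to &lt;code&gt;workspaces/staging/infrastructure/terraform.tfstate&lt;/code&gt;, separate from every other environment.&lt;/p&gt;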

&lt;h3&gt;
  
  
  Before: Duplicate Directories
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# production/rds.tf - Copied and modified for each environment&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"random_password"&lt;/span&gt; &lt;span class="s2"&gt;"db_password"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;length&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
  &lt;span class="nx"&gt;special&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Note: Consider using password_wo instead of password in Terraform 1.11+&lt;/span&gt;
&lt;span class="c1"&gt;# for better security (write-only attribute that doesn't show in plans)&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_db_instance"&lt;/span&gt; &lt;span class="s2"&gt;"main"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;identifier&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"mycoco-prod"&lt;/span&gt;
  &lt;span class="nx"&gt;engine&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"postgres"&lt;/span&gt;
  &lt;span class="nx"&gt;engine_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"15.3"&lt;/span&gt;
  &lt;span class="nx"&gt;instance_class&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"db.t3.large"&lt;/span&gt;  &lt;span class="c1"&gt;# Hard-coded per environment&lt;/span&gt;

  &lt;span class="nx"&gt;allocated_storage&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
  &lt;span class="nx"&gt;storage_encrypted&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

  &lt;span class="nx"&gt;username&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"dbadmin"&lt;/span&gt;
  &lt;span class="nx"&gt;password&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;random_password&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_password&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;

  &lt;span class="c1"&gt;# 50+ more lines of mostly identical configuration...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  After: Workspace-Aware Module
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# main.tf - Single configuration for all environments&lt;/span&gt;
&lt;span class="nx"&gt;locals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;workspace_config&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;dev&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;instance_class&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"db.t3.micro"&lt;/span&gt;
      &lt;span class="nx"&gt;allocated_storage&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
      &lt;span class="nx"&gt;backup_retention&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;staging&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;instance_class&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"db.t3.medium"&lt;/span&gt;
      &lt;span class="nx"&gt;allocated_storage&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
      &lt;span class="nx"&gt;backup_retention&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;production&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;instance_class&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"db.t3.large"&lt;/span&gt;
      &lt;span class="nx"&gt;allocated_storage&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
      &lt;span class="nx"&gt;backup_retention&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;env_config&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;workspace_config&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;workspace&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"database"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"./modules/rds"&lt;/span&gt;

  &lt;span class="nx"&gt;environment&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;terraform&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;workspace&lt;/span&gt;
  &lt;span class="nx"&gt;instance_class&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env_config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;instance_class&lt;/span&gt;
  &lt;span class="nx"&gt;allocated_storage&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env_config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;allocated_storage&lt;/span&gt;
  &lt;span class="nx"&gt;backup_retention&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env_config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;backup_retention&lt;/span&gt;

  &lt;span class="c1"&gt;# Common configuration for all environments&lt;/span&gt;
  &lt;span class="nx"&gt;engine&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"postgres"&lt;/span&gt;
  &lt;span class="nx"&gt;engine_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"15.3"&lt;/span&gt;

  &lt;span class="c1"&gt;# Workspace-aware naming&lt;/span&gt;
  &lt;span class="nx"&gt;identifier&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"mycoco-${terraform.workspace}"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
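
&lt;p&gt;The direct index on &lt;code&gt;local.workspace_config&lt;/code&gt; fails with an invalid-index error when the selected workspace has no entry, which includes the built-in &lt;code&gt;default&lt;/code&gt; workspace. One possible variation, shown here as a sketch rather than MyCoCo's actual code, replaces the &lt;code&gt;env_config&lt;/code&gt; line with a &lt;code&gt;lookup&lt;/code&gt; that falls back to the dev sizing:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;locals {
  # Replaces the direct index: unknown workspaces get dev sizing instead of an error
  env_config = lookup(
    local.workspace_config,
    terraform.workspace,
    local.workspace_config["dev"]
  )
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The fallback keeps a stray workspace from provisioning production-sized resources. Teams that prefer a hard failure for unknown workspaces can keep the direct index.&lt;/p&gt;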



&lt;h3&gt;
  
  
  The Module Structure
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# modules/rds/variables.tf&lt;/span&gt;
&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"identifier"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The name of the RDS instance"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"engine"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The database engine"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"engine_version"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The engine version to use"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"instance_class"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The instance type of the RDS instance"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"allocated_storage"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The allocated storage in gibibytes"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;number&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"backup_retention"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The days to retain backups for"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;number&lt;/span&gt;
  &lt;span class="nx"&gt;default&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"environment"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The deployment environment"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# modules/rds/main.tf&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"random_password"&lt;/span&gt; &lt;span class="s2"&gt;"db_password"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;length&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
  &lt;span class="nx"&gt;special&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_db_instance"&lt;/span&gt; &lt;span class="s2"&gt;"main"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;identifier&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;identifier&lt;/span&gt;
  &lt;span class="nx"&gt;engine&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;engine&lt;/span&gt;
  &lt;span class="nx"&gt;engine_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;engine_version&lt;/span&gt;
  &lt;span class="nx"&gt;instance_class&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;instance_class&lt;/span&gt;

  &lt;span class="nx"&gt;allocated_storage&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;allocated_storage&lt;/span&gt;
  &lt;span class="nx"&gt;max_allocated_storage&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;allocated_storage&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
  &lt;span class="nx"&gt;storage_encrypted&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

  &lt;span class="nx"&gt;db_name&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"mycoco"&lt;/span&gt;
  &lt;span class="nx"&gt;username&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"dbadmin"&lt;/span&gt;
  &lt;span class="nx"&gt;password&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;random_password&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_password&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;

  &lt;span class="nx"&gt;backup_retention_period&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;backup_retention&lt;/span&gt;
  &lt;span class="nx"&gt;backup_window&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"03:00-04:00"&lt;/span&gt;

  &lt;span class="c1"&gt;# Production-only features&lt;/span&gt;
  &lt;span class="nx"&gt;deletion_protection&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;environment&lt;/span&gt; &lt;span class="p"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"production"&lt;/span&gt; &lt;span class="err"&gt;?&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="nx"&gt;skip_final_snapshot&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;environment&lt;/span&gt; &lt;span class="err"&gt;!&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"production"&lt;/span&gt; &lt;span class="err"&gt;?&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;Environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;environment&lt;/span&gt;
    &lt;span class="nx"&gt;ManagedBy&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# modules/rds/outputs.tf&lt;/span&gt;
&lt;span class="nx"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;"endpoint"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The connection endpoint"&lt;/span&gt;
  &lt;span class="nx"&gt;value&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_db_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;"database_name"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The database name"&lt;/span&gt;
  &lt;span class="nx"&gt;value&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_db_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_name&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;"instance_id"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The RDS instance ID"&lt;/span&gt;
  &lt;span class="nx"&gt;value&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_db_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Environment-Specific Variables
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# envs/production.tfvars&lt;/span&gt;
&lt;span class="nx"&gt;region&lt;/span&gt;               &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ca-central-1"&lt;/span&gt;
&lt;span class="nx"&gt;enable_monitoring&lt;/span&gt;    &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="nx"&gt;enable_backups&lt;/span&gt;       &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="nx"&gt;multi_az&lt;/span&gt;             &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="nx"&gt;performance_insights&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="c1"&gt;# envs/staging.tfvars&lt;/span&gt;
&lt;span class="nx"&gt;region&lt;/span&gt;               &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ca-central-1"&lt;/span&gt;
&lt;span class="nx"&gt;enable_monitoring&lt;/span&gt;    &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="nx"&gt;enable_backups&lt;/span&gt;       &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="nx"&gt;multi_az&lt;/span&gt;             &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="nx"&gt;performance_insights&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

&lt;span class="c1"&gt;# envs/dev.tfvars&lt;/span&gt;
&lt;span class="nx"&gt;region&lt;/span&gt;               &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ca-central-1"&lt;/span&gt;
&lt;span class="nx"&gt;enable_monitoring&lt;/span&gt;    &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="nx"&gt;enable_backups&lt;/span&gt;       &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="nx"&gt;multi_az&lt;/span&gt;             &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="nx"&gt;performance_insights&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Workspace Management Workflow
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# List available workspaces&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;terraform workspace list
  default
  dev
&lt;span class="k"&gt;*&lt;/span&gt; staging
  production

&lt;span class="c"&gt;# Switch to production&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;terraform workspace &lt;span class="k"&gt;select &lt;/span&gt;production

&lt;span class="c"&gt;# Deploy with workspace-specific config&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;terraform apply &lt;span class="nt"&gt;-var-file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"envs/&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;terraform workspace show&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;.tfvars"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Results: MyCoCo's Workspace Transformation
&lt;/h2&gt;

&lt;p&gt;The impact of adopting workspaces with modules was immediate:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;70% Reduction in Maintenance Time:&lt;/strong&gt; Changes that previously required updating three separate codebases now happened in one place. Sam noted:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I make one change, test it in dev workspace, promote through staging, and apply to production—all with the same code."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Zero Environment Drift:&lt;/strong&gt; With all environments built from the same modules, their configurations could no longer diverge in code. The only differences were intentional, defined in workspace-specific variables.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rapid Environment Provisioning:&lt;/strong&gt; When MyCoCo needed a disaster recovery environment, Jordan created it in two hours: &lt;code&gt;terraform workspace new dr-east &amp;amp;&amp;amp; terraform apply&lt;/code&gt;. Previously, this would have taken days of copying and modifying configurations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Clear Environment Isolation:&lt;/strong&gt; Maya, the security engineer, appreciated the built-in separation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Each workspace has its own state file. There's no risk of accidentally modifying production while working in dev."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The real validation came during their next security audit. Instead of reviewing three sets of configurations, auditors examined one modular codebase with clear environment separation through workspaces.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start with Workspaces&lt;/strong&gt;: Before diving into complex module structures, implement workspaces to separate your environments while using the same code.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use terraform.workspace&lt;/strong&gt;: Reference the current workspace in your configurations for dynamic resource naming and conditional logic.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Combine with Modules&lt;/strong&gt;: Workspaces handle environment separation; modules handle code reuse. Together, they eliminate duplication.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Workspace-Specific Variables&lt;/strong&gt;: Store environment-specific values in separate tfvars files, loaded based on the current workspace.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;State Isolation&lt;/strong&gt;: Each workspace maintains its own state file, providing natural environment isolation without complex backend configurations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Documentation is Critical&lt;/strong&gt;: Each module needs clear documentation of required and optional variables, making adoption straightforward. Leverage terraform-docs to automate this process.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Consider Community Modules&lt;/strong&gt;: For complex RDS setups, consider using established modules like terraform-aws-modules/rds, which you can parameterize per workspace in the same way as a local module.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
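
&lt;p&gt;As a rough illustration of takeaways 2 and 4, the sketch below derives a naming prefix and one conditional setting from &lt;code&gt;terraform.workspace&lt;/code&gt;. The &lt;code&gt;mycoco-&lt;/code&gt; prefix, the local names, and the retention values are assumptions made for this example, not MyCoCo's actual configuration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;# Illustrative sketch: local names, the "mycoco-" prefix, and retention values are assumptions
locals {
  # terraform.workspace is the name of the currently selected workspace, e.g. "staging"
  environment   = terraform.workspace
  is_production = local.environment == "production"

  # Dynamic naming: every resource name carries its environment
  name_prefix = "mycoco-${local.environment}"

  # Conditional logic: keep backups longer in production than elsewhere
  backup_retention_days = local.is_production ? 30 : 7
}

output "name_prefix" {
  description = "Prefix to use for resource names in this workspace"
  value       = local.name_prefix
}

output "backup_retention_days" {
  description = "Backup retention chosen for this workspace"
  value       = local.backup_retention_days
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With the &lt;code&gt;staging&lt;/code&gt; workspace selected, &lt;code&gt;name_prefix&lt;/code&gt; evaluates to &lt;code&gt;mycoco-staging&lt;/code&gt; and the retention falls back to the non-production value. Values that differ in more than one dimension are usually easier to keep in the per-workspace tfvars files than in conditionals.&lt;/p&gt;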

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;For teams drowning in duplicate Terraform configurations, workspaces combined with modules offer a path to sanity. The investment in refactoring pays immediate dividends in reduced maintenance, eliminated drift, and happier engineers.&lt;/p&gt;

&lt;p&gt;The key is starting simple: implement workspaces first, then gradually extract common patterns into reusable modules. Don't try to build the perfect module structure upfront—let your actual usage patterns guide the abstraction.&lt;/p&gt;

&lt;p&gt;Ready to transform your multi-environment Terraform? Start with workspaces, add modules, and watch your infrastructure management complexity disappear.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's your experience with Terraform workspaces and modules? Have you struggled with managing multiple environments or found better approaches? Share your infrastructure-as-code wins and challenges in the comments below!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>terraform</category>
      <category>infrastructureascode</category>
      <category>devops</category>
      <category>modules</category>
    </item>
    <item>
      <title>Why Platform Teams Are Burning Out: How MyCoCo Escaped the Developer Experience Trap</title>
      <dc:creator>Dhruv Chaudhary</dc:creator>
      <pubDate>Fri, 19 Sep 2025 16:38:15 +0000</pubDate>
      <link>https://forem.com/dc-shimla/why-platform-teams-are-burning-out-how-mycoco-escaped-the-developer-experience-trap-59b0</link>
      <guid>https://forem.com/dc-shimla/why-platform-teams-are-burning-out-how-mycoco-escaped-the-developer-experience-trap-59b0</guid>
      <description>&lt;p&gt;Platform engineering promised to solve developer productivity challenges by creating self-service infrastructure. But at many organizations, platform teams are drowning in complexity they've absorbed from developers. MyCoCo discovered that its platform team had become the new bottleneck, working 60-hour weeks while being blamed for slowing everyone down. Here's how they transformed platform engineering from a burnout factory into a sustainable practice.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Problem:&lt;/strong&gt; Platform teams absorb all infrastructure complexity while becoming the scapegoat for deployment delays.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution:&lt;/strong&gt; Golden paths, Team Topologies, clear SLAs, and project inception frameworks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Impact:&lt;/strong&gt; MyCoCo reduced platform team stress by 50% and increased deployment velocity 3x.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Implementation:&lt;/strong&gt; Pre-approved Terraform patterns covering 80% of use cases, paired with architectural education for developers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bottom Line:&lt;/strong&gt; Stop trying to be everything to everyone—constrain flexibility to preserve sanity.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge: MyCoCo's Platform Team Meltdown
&lt;/h2&gt;

&lt;p&gt;Jordan had been leading MyCoCo's platform engineering initiative for nine months when they finally admitted defeat:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We're not building a platform," they told Alex during their one-on-one. "We're running an IT helpdesk that happens to use Terraform."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The numbers told the story. MyCoCo's platform team of three engineers supported 50 developers across five product teams. They'd built what seemed like a modern setup: GitHub Actions for CI/CD, an LLM-based code reviewer for infrastructure PRs, and standardized Terraform modules for AWS deployments. Yet the team was drowning.&lt;/p&gt;

&lt;p&gt;Every infrastructure PR became a potential disaster. One developer's "simple fix" to update a security group rule would have deleted and recreated the entire security group, causing a 20-minute production outage. Another's "minor IAM adjustment" affected 15 other services. The AI code reviewer caught syntax errors but couldn't understand business context or blast radius.&lt;/p&gt;

&lt;p&gt;Worse, developers constantly bypassed the intake process. Jordan's Slack was a stream of "URGENT: Need this deployed NOW!" messages. Sprint planning became fiction—the team spent every day firefighting requests from DMs, hallway conversations, and executive escalations.&lt;/p&gt;

&lt;p&gt;The breaking point came when a developer complained:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I can deploy a Lambda in 5 minutes on my personal AWS account. Why does it take your team 3 days?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;They didn't see that their "simple" Lambda required updating three Terraform modules, modifying VPC configurations, creating new IAM patterns, updating monitoring dashboards, and committing to support this variation forever.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: From Chaos to Sustainable Platform Engineering
&lt;/h2&gt;

&lt;p&gt;MyCoCo's transformation started with a hard truth: &lt;strong&gt;unlimited flexibility means unlimited support burden&lt;/strong&gt;. Jordan's team implemented five key changes:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Golden Paths Over Infinite Options
&lt;/h3&gt;

&lt;p&gt;Instead of supporting every possible infrastructure pattern, MyCoCo created pre-approved Terraform modules for common use cases. These "golden paths" covered 80% of developer needs while encoding all enterprise requirements automatically.&lt;/p&gt;

&lt;p&gt;Rather than letting developers architect networking and compute from scratch, MyCoCo offered three standardized deployment patterns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Standard VPC and Networking&lt;/strong&gt;: Multi-AZ setup with public/private subnets, NAT gateways, VPC endpoints, and enterprise-compliant security groups&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standard Linux EC2&lt;/strong&gt;: Application-ready instances with SSM access, CloudWatch agent, security patching, backup policies, and proper IAM roles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standard ECS Serverless&lt;/strong&gt;: Fargate deployments with ALB integration, service discovery, auto-scaling policies, and centralized logging&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each golden path included cost allocation tags, security scanning, compliance controls, and monitoring—all pre-configured. Developers got consistent, secure infrastructure while the platform team eliminated architectural debates and one-off configurations entirely.&lt;/p&gt;
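
&lt;p&gt;One way to encode a golden path's limited flexibility is to expose only a handful of validated inputs from the module. The sketch below imagines what the input contract for the Standard ECS Serverless path might look like. The variable names, the three size tiers, and the CPU/memory pairs are hypothetical, not MyCoCo's published interface:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;# Hypothetical input contract for a golden-path ECS module.
# Variable names, size tiers, and CPU/memory pairs are assumptions for illustration.
variable "service_name" {
  description = "Short lowercase name used for the service and its resources"
  type        = string

  validation {
    condition     = can(regex("^[a-z][a-z0-9-]{2,30}$", var.service_name))
    error_message = "The service_name must be 3 to 31 lowercase letters, digits, or hyphens, starting with a letter."
  }
}

variable "size" {
  description = "Pre-approved capacity tier; anything else goes through a custom request"
  type        = string
  default     = "small"

  validation {
    condition     = contains(["small", "medium", "large"], var.size)
    error_message = "The size must be one of: small, medium, large."
  }
}

locals {
  # Each tier maps to a Fargate CPU/memory pair chosen by the platform team
  sizes = {
    small  = { cpu = 256, memory = 512 }
    medium = { cpu = 512, memory = 1024 }
    large  = { cpu = 1024, memory = 2048 }
  }

  task = local.sizes[var.size]
}

output "task_cpu" {
  description = "CPU units for the selected tier"
  value       = local.task.cpu
}

output "task_memory" {
  description = "Memory (MiB) for the selected tier"
  value       = local.task.memory
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because developers pick a tier rather than raw CPU and memory numbers, a request for an unsupported size fails at &lt;code&gt;terraform plan&lt;/code&gt; with a clear message instead of turning into a review debate.&lt;/p&gt;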

&lt;h3&gt;
  
  
  2. Team Topologies for Clear Boundaries
&lt;/h3&gt;

&lt;p&gt;Following Team Topologies principles, MyCoCo established clear interaction modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;X-as-a-Service&lt;/strong&gt;: Golden path modules with no customization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Collaboration&lt;/strong&gt;: Scheduled pairing for complex requirements
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Facilitation&lt;/strong&gt;: Office hours for questions and guidance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No more random Slack interruptions. Developers knew exactly how and when to engage the platform team.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Transparent Service Level Expectations
&lt;/h3&gt;

&lt;p&gt;Published SLAs set clear expectations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Standard patterns&lt;/strong&gt; (using golden paths): 2 business days&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom configurations&lt;/strong&gt;: 5 business days&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;New patterns&lt;/strong&gt; requiring platform changes: 10 business days&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;"Urgent" requests dropped 80% once teams realized proper planning yielded faster results.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Project Inception Requirements
&lt;/h3&gt;

&lt;p&gt;Borrowing from agile methodologies, any project requiring infrastructure now needed a kick-off with platform team involvement. This revealed infrastructure needs early, preventing mid-sprint surprises.&lt;/p&gt;

&lt;p&gt;A simple template captured requirements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What are you building?&lt;/li&gt;
&lt;li&gt;Which golden paths fit your needs?&lt;/li&gt;
&lt;li&gt;What's unique about your requirements?&lt;/li&gt;
&lt;li&gt;When do you need infrastructure ready?&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5. Infrastructure Architecture Reviews
&lt;/h3&gt;

&lt;p&gt;The platform team instituted monthly "Infrastructure Deep Dives" where they walked through real production incidents and changes. Instead of abstract documentation, they showed actual scenarios:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Case Study Session Example:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Here's last week's 'simple' Lambda request. Let me show you what actually happened behind the scenes..."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Step 1&lt;/strong&gt;: New subnet allocation in three VPCs (dev, staging, prod)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Step 2&lt;/strong&gt;: IAM role creation with 14 policy attachments for scoped permissions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Step 3&lt;/strong&gt;: Updating shared security groups across 6 availability zones&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Step 4&lt;/strong&gt;: Modifying 3 Terraform modules used by 20 other services&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Step 5&lt;/strong&gt;: Cost allocation tag updates affecting monthly billing reports&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Developers finally understood why their personal AWS experience didn't translate. One developer commented:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I had no idea changing one Lambda meant touching 30 other resources. Now I get why you push golden paths so hard."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;These sessions became so popular that developers started attending voluntarily, leading to better infrastructure proposals and fewer "urgent" requests.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results: MyCoCo's Platform Renaissance
&lt;/h2&gt;

&lt;p&gt;The transformation took three months, but the results were dramatic:&lt;/p&gt;

&lt;h3&gt;
  
  
  Quantitative Improvements
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Platform team workweeks dropped from &lt;strong&gt;60 to 40 hours&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Standard deployments accelerated from &lt;strong&gt;3 days to 4 hours&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Urgent requests decreased by &lt;strong&gt;80%&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Developer satisfaction scores increased &lt;strong&gt;35%&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Qualitative Changes
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Platform engineers felt valued instead of blamed&lt;/li&gt;
&lt;li&gt;Developers understood infrastructure complexity better&lt;/li&gt;
&lt;li&gt;Executive pressure shifted from "work faster" to "work smarter"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most importantly, the platform team could finally focus on innovation instead of firefighting. They had time to improve golden paths, automate more processes, and even tackle technical debt from the early "build everything custom" days.&lt;/p&gt;

&lt;p&gt;Sam, who'd been skeptical of platform engineering after years of manual DevOps work, admitted:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I finally have time to think about improvements instead of just surviving each day."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;Platform engineering isn't about building a portal or implementing the latest tools. It's about sustainable practices that balance developer autonomy with operational reality.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start with Constraints&lt;/strong&gt;: Less flexibility means less support burden. Your platform team can't be everything to everyone.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Invest in Education&lt;/strong&gt;: Developers aren't trying to make your life difficult—they genuinely don't understand enterprise infrastructure complexity.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Set Clear Boundaries&lt;/strong&gt;: Clear interaction patterns and SLAs prevent platform teams from becoming an always-on helpdesk.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Measure Team Health&lt;/strong&gt;: Track platform team health alongside developer productivity metrics.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Focus on the 80%&lt;/strong&gt;: Golden paths that handle most use cases are more valuable than infinite customization options.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Platform engineering success isn't measured by how many features you can build or how much customization you can support. It's measured by whether your team can sustainably deliver value without burning out.&lt;/p&gt;

&lt;p&gt;The goal isn't to make developers happy in the short term—it's to create a system that works for everyone in the long term. Sometimes that means saying no to custom requests and guiding teams toward proven patterns instead.&lt;/p&gt;

&lt;p&gt;Ready to rescue your platform team from burnout? Start by auditing how many infrastructure patterns you're supporting and ask: could 80% of these use a golden path instead?&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's your experience with platform team burnout? Have you successfully implemented golden paths or struggled with similar challenges? Share your platform engineering wins and failures in the comments below!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>platformengineering</category>
      <category>devops</category>
      <category>devrel</category>
      <category>burnout</category>
    </item>
    <item>
      <title>How MyCoCo Eliminated $180K in Zombie Infrastructure: The Framework That Transformed Their Resource Lifecycle Management</title>
      <dc:creator>Dhruv Chaudhary</dc:creator>
      <pubDate>Fri, 19 Sep 2025 16:36:48 +0000</pubDate>
      <link>https://forem.com/dc-shimla/how-mycoco-eliminated-180k-in-zombie-infrastructure-the-framework-that-transformed-their-resource-5bg5</link>
      <guid>https://forem.com/dc-shimla/how-mycoco-eliminated-180k-in-zombie-infrastructure-the-framework-that-transformed-their-resource-5bg5</guid>
      <description>&lt;p&gt;Growing technology companies often accumulate "zombie infrastructure"—resources that were provisioned for specific projects or experiments but never properly decommissioned when no longer needed. Without clear processes for tracking resource ownership and lifecycle, DevOps teams become reactive firefighters while costs spiral upward. MyCoCo's systematic approach to infrastructure lifecycle management eliminated $180,000 in annual waste while establishing sustainable cross-team collaboration that scales with business growth.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Problem:&lt;/strong&gt; MyCoCo had no systematic process for managing infrastructure lifecycle, leading to abandoned resources, unclear ownership, and reactive cost management as the company scaled from startup to enterprise.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution:&lt;/strong&gt; Implemented RACI accountability framework with weekly operational reviews, monthly strategic planning, and structured decommissioning processes coordinated between DevOps, Product, and Finance teams.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Impact:&lt;/strong&gt; MyCoCo eliminated $180K annually in unused resources, reduced infrastructure decision-making time by 60%, and transformed reactive operations into proactive lifecycle management.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Implementation:&lt;/strong&gt; RACI matrices for resource ownership, comprehensive provider-level tagging with team-based identifiers, and automated discovery using Cloud Custodian policies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bottom Line:&lt;/strong&gt; Systematic lifecycle management prevents infrastructure sprawl while enabling sustainable scaling—essential for companies transitioning from startup to enterprise operations.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge: MyCoCo's Infrastructure Sprawl Crisis
&lt;/h2&gt;

&lt;p&gt;By late 2024, MyCoCo had reached 60+ employees across five product lines. What started as a simple project management tool had evolved into a comprehensive SaaS platform serving Fortune 500 customers across healthcare, finance, and retail verticals. But success brought unexpected challenges.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Every week, I'm getting requests for new environments, new integrations, new regions," Sam reflected during a team retrospective. "But when was the last time anyone asked me to shut something down?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The numbers told the story. Jordan's quarterly cost analysis revealed concerning trends: infrastructure spending had grown 300% over 18 months, but active application usage had only increased 150%. Somewhere in their AWS accounts, significant resources were running without clear business justification.&lt;/p&gt;

&lt;p&gt;The breaking point came during a routine security audit when Maya discovered development environments from discontinued features still consuming production-grade resources. A proof-of-concept integration with a canceled partner was still running three EC2 instances across multiple regions. Load-testing infrastructure from six months earlier was still provisioned "just in case."&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We're paying for the infrastructure equivalent of a zombie apocalypse," Alex observed during the executive team meeting. "Resources that should be dead but keep consuming our budget."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The root problem wasn't technical—it was organizational. Product teams would request infrastructure for new features or experiments, but no clear process existed for determining when resources could be safely decommissioned. DevOps handled the technical provisioning, but Product Owners made business decisions about feature continuation. Finance tracked overall spending but couldn't map costs to specific business initiatives.&lt;/p&gt;

&lt;p&gt;Without systematic lifecycle management, MyCoCo was hemorrhaging money on infrastructure that no longer served business purposes.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: MyCoCo's Systematic Lifecycle Framework
&lt;/h2&gt;

&lt;p&gt;Rather than implementing expensive tooling, MyCoCo focused on organizational process improvements that would scale with their growing team. The solution centered on three core components: clear accountability, structured communication, and systematic decommissioning.&lt;/p&gt;

&lt;h3&gt;
  
  
  RACI Accountability for Infrastructure Decisions
&lt;/h3&gt;

&lt;p&gt;MyCoCo established RACI (Responsible, Accountable, Consulted, Informed) matrices that eliminated confusion about infrastructure ownership. DevOps Engineers remained responsible for technical provisioning and maintenance while Product Owners became accountable for business decisions including feature lifecycle and resource needs. Finance teams owned cost optimization and budget compliance, with everyone staying informed about major changes.&lt;/p&gt;

&lt;p&gt;Systematic resource tagging became essential for financial allocation, incident escalation, and business alignment. Without proper tagging, teams waste hours determining resource ownership and accountability while Finance struggles with accurate cost allocation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Critical Enterprise Practice:&lt;/strong&gt; MyCoCo used team-based and role-based identifiers rather than individual names for Owner, CreatedBy, and BusinessContact tags. Just like using service emails (&lt;a href="mailto:devops@mycoco.com"&gt;devops@mycoco.com&lt;/a&gt;) instead of individual email addresses, this approach ensured tags remained valid when team members changed roles or left the organization.&lt;/p&gt;

&lt;h3&gt;
  
  
  MyCoCo's Essential Infrastructure Tags
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tag Name&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Example Values&lt;/th&gt;
&lt;th&gt;Why Essential&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Owner&lt;/td&gt;
&lt;td&gt;Identifies the team responsible for ongoing maintenance and decisions&lt;/td&gt;
&lt;td&gt;team-analytics, team-platform&lt;/td&gt;
&lt;td&gt;Essential for escalation and accountability&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CostCenter&lt;/td&gt;
&lt;td&gt;Maps resources to budget allocation for financial tracking and chargeback&lt;/td&gt;
&lt;td&gt;cost-center-analytics, cost-center-platform&lt;/td&gt;
&lt;td&gt;Required for accurate cost allocation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Environment&lt;/td&gt;
&lt;td&gt;Distinguishes dev/staging/production resources&lt;/td&gt;
&lt;td&gt;dev, staging, production&lt;/td&gt;
&lt;td&gt;Lifecycle management and risk assessment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BusinessUnit&lt;/td&gt;
&lt;td&gt;Groups resources by organizational structure&lt;/td&gt;
&lt;td&gt;product-eng, data-platform, security&lt;/td&gt;
&lt;td&gt;Executive reporting and portfolio management&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Application&lt;/td&gt;
&lt;td&gt;Identifies specific applications or services&lt;/td&gt;
&lt;td&gt;mycoco-projects, mycoco-analytics&lt;/td&gt;
&lt;td&gt;Dependency mapping and impact analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Project&lt;/td&gt;
&lt;td&gt;Links resources to business initiatives&lt;/td&gt;
&lt;td&gt;q4-migration, customer-portal-v2&lt;/td&gt;
&lt;td&gt;ROI tracking and project cost management&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CreatedBy&lt;/td&gt;
&lt;td&gt;Tracks which team provisioned the resource&lt;/td&gt;
&lt;td&gt;devops-team, platform-team, security-team&lt;/td&gt;
&lt;td&gt;Operational context and troubleshooting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BusinessContact&lt;/td&gt;
&lt;td&gt;Provides service email for business decision escalation&lt;/td&gt;
&lt;td&gt;
&lt;a href="mailto:analytics-team@mycoco.com"&gt;analytics-team@mycoco.com&lt;/a&gt;, &lt;a href="mailto:platform@mycoco.com"&gt;platform@mycoco.com&lt;/a&gt;
&lt;/td&gt;
&lt;td&gt;Business decision escalation when technical teams are unavailable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DecommissionBy&lt;/td&gt;
&lt;td&gt;Sets planned lifecycle end date&lt;/td&gt;
&lt;td&gt;2024-12-31, 2025-06-15 (ISO 8601 format)&lt;/td&gt;
&lt;td&gt;Proactive resource management and cost optimization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Criticality&lt;/td&gt;
&lt;td&gt;Defines business impact level&lt;/td&gt;
&lt;td&gt;critical, high, medium, low&lt;/td&gt;
&lt;td&gt;Incident prioritization and maintenance planning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ManagedBy&lt;/td&gt;
&lt;td&gt;Indicates management tool&lt;/td&gt;
&lt;td&gt;terraform, manual, cloudformation&lt;/td&gt;
&lt;td&gt;Operational context and change control procedures&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
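
&lt;p&gt;Tag conventions like these tend to erode unless something checks them. One option, sketched here on the assumption that tag values arrive through input variables, is Terraform variable validation. The variable names match those used in the provider configuration later in this section; the patterns and messages are assumptions for illustration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;# Sketch only: the validation patterns are assumptions derived from the example values in the table
variable "default_owner" {
  description = "Team identifier applied as the Owner tag"
  type        = string

  validation {
    # Require a team-based identifier rather than an individual's name
    condition     = can(regex("^team-[a-z0-9-]+$", var.default_owner))
    error_message = "The Owner tag must be a team identifier such as team-platform."
  }
}

variable "criticality_level" {
  description = "Business impact level applied as the Criticality tag"
  type        = string

  validation {
    condition     = contains(["critical", "high", "medium", "low"], var.criticality_level)
    error_message = "The Criticality tag must be one of: critical, high, medium, low."
  }
}

variable "default_sunset_date" {
  description = "Planned end-of-life date applied as the DecommissionBy tag"
  type        = string

  validation {
    # Checks the YYYY-MM-DD shape only; it does not verify that the date exists
    condition     = can(regex("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", var.default_sunset_date))
    error_message = "The DecommissionBy tag must be a date in YYYY-MM-DD form, such as 2025-06-15."
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A malformed value then fails at &lt;code&gt;terraform plan&lt;/code&gt;, before any mis-tagged resource is created.&lt;/p&gt;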

&lt;p&gt;MyCoCo implemented this comprehensive tagging through provider-level configuration, ensuring that every resource automatically inherited consistent ownership and lifecycle metadata:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Provider-level tagging strategy ensuring ALL resources have ownership tracking&lt;/span&gt;
&lt;span class="k"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"aws"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;region&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ca-central-1"&lt;/span&gt;

  &lt;span class="nx"&gt;default_tags&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;Owner&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;default_owner&lt;/span&gt;
      &lt;span class="nx"&gt;CostCenter&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cost_center&lt;/span&gt;
      &lt;span class="nx"&gt;Environment&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;environment&lt;/span&gt;
      &lt;span class="nx"&gt;BusinessUnit&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;business_unit&lt;/span&gt;
      &lt;span class="nx"&gt;Application&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;application_name&lt;/span&gt;
      &lt;span class="nx"&gt;Project&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;project_name&lt;/span&gt;
      &lt;span class="nx"&gt;CreatedBy&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;created_by_team&lt;/span&gt;
      &lt;span class="nx"&gt;BusinessContact&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;business_contact&lt;/span&gt;
      &lt;span class="nx"&gt;DecommissionBy&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;default_sunset_date&lt;/span&gt;
      &lt;span class="nx"&gt;Criticality&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;criticality_level&lt;/span&gt;
      &lt;span class="nx"&gt;ManagedBy&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With these defaults in place, every taggable resource the provider creates (EC2 instances, RDS databases, S3 buckets, Lambda functions, and so on) picks up the same ownership and lifecycle tags without a per-resource tag block.&lt;/p&gt;
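
&lt;p&gt;Because these defaults are stamped onto every resource, a typo in one input variable spreads everywhere. The sketch below shows how two of the variables referenced in the provider block could be validated. The variable names come from that block; the validation rules, descriptions, and error messages are assumptions derived from the allowed values in the tag table, not MyCoCo's actual configuration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;# Hypothetical validation rules for two of the tag inputs used in default_tags.
variable "criticality_level" {
  type        = string
  description = "Business impact level applied as the Criticality tag."

  validation {
    # Accept only the four levels listed in the tag table.
    condition     = contains(["critical", "high", "medium", "low"], var.criticality_level)
    error_message = "The criticality_level value must be one of: critical, high, medium, low."
  }
}

variable "default_sunset_date" {
  type        = string
  description = "Planned decommission date applied as the DecommissionBy tag."

  validation {
    # Shape check only (YYYY-MM-DD); it does not reject impossible dates such as 2025-13-40.
    condition     = can(regex("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", var.default_sunset_date))
    error_message = "The default_sunset_date value must use the YYYY-MM-DD format, for example 2025-06-15."
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With rules like these, a plan fails on a malformed value before any resource is tagged with it, which matters when downstream tooling parses the tag as a date.&lt;/p&gt;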

&lt;h3&gt;
  
  
  Stakeholder Visibility and Self-Service Access
&lt;/h3&gt;

&lt;p&gt;To provide visibility for stakeholders, MyCoCo created an auditor role with read-only permissions, accessed through SSO. This self-service approach eliminated the weekly "who owns this resource?" questions by giving Product Owners and Finance teams direct access to resource inventory and cost data.&lt;/p&gt;

&lt;p&gt;Stakeholders use AWS Resource Groups and Cost Explorer with tag filters to view their specific resources and costs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Example: Analytics team viewing their resources via AWS CLI (read-only access)&lt;/span&gt;
aws resourcegroupstaggingapi get-resources &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--tag-filters&lt;/span&gt; &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Owner,Values&lt;span class="o"&gt;=&lt;/span&gt;team-analytics &lt;span class="nv"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Project,Values&lt;span class="o"&gt;=&lt;/span&gt;analytics &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-type-filters&lt;/span&gt; ec2:instance rds:db s3

&lt;span class="c"&gt;# Results show all resources owned by analytics team&lt;/span&gt;
&lt;span class="c"&gt;# - i-0abc123 (EC2 instance, environment=production)&lt;/span&gt;
&lt;span class="c"&gt;# - db-xyz789 (RDS instance, environment=staging)&lt;/span&gt;
&lt;span class="c"&gt;# - analytics-data-bucket (S3 bucket, environment=production)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
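
&lt;p&gt;The auditor role itself can be managed in the same Terraform codebase. The sketch below assumes the SSO setup is AWS IAM Identity Center and that the AWS-managed &lt;code&gt;ReadOnlyAccess&lt;/code&gt; policy is an acceptable baseline; the permission set name and description are hypothetical, and the article does not show MyCoCo's actual role definition:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;# Hypothetical read-only permission set for Product Owners and Finance.
# Assumes IAM Identity Center is already enabled in the organization.
data "aws_ssoadmin_instances" "this" {}

locals {
  # The data source returns a set of instance ARNs; take the first one.
  sso_instance_arn = tolist(data.aws_ssoadmin_instances.this.arns)[0]
}

resource "aws_ssoadmin_permission_set" "auditor" {
  name         = "Auditor" # assumed name
  description  = "Read-only access to resource inventory for stakeholders"
  instance_arn = local.sso_instance_arn
}

# Attach the AWS-managed read-only policy to the permission set.
resource "aws_ssoadmin_managed_policy_attachment" "auditor_read_only" {
  instance_arn       = local.sso_instance_arn
  permission_set_arn = aws_ssoadmin_permission_set.auditor.arn
  managed_policy_arn = "arn:aws:iam::aws:policy/ReadOnlyAccess"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Assigning the permission set to specific groups and accounts is left out here. Depending on how billing access is configured, viewing cost data may also require additional billing permissions beyond this baseline.&lt;/p&gt;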



&lt;h3&gt;
  
  
  Structured Communication Cadences
&lt;/h3&gt;

&lt;p&gt;MyCoCo implemented three meeting rhythms that transformed ad-hoc coordination into systematic planning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Weekly operational reviews&lt;/strong&gt; (30 minutes): utilization metrics, cost anomalies, and upcoming resource needs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monthly strategic planning sessions&lt;/strong&gt; (90 minutes): capacity planning, budget variance analysis, and technology roadmap alignment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quarterly business reviews&lt;/strong&gt; (half-day): lifecycle optimization and process improvements&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key insight was treating these meetings as investment reviews rather than status updates, focusing on ROI and business value rather than purely technical metrics.&lt;/p&gt;

&lt;h3&gt;
  
  
  Automated Resource Discovery and Systematic Decommissioning
&lt;/h3&gt;

&lt;p&gt;MyCoCo developed a systematic approach to resource retirement that prevented both premature shutdowns and indefinite resource sprawl.&lt;/p&gt;

&lt;p&gt;To automate the discovery of resources approaching their decommission dates, MyCoCo implemented Cloud Custodian, an open-source tool for managing cloud resources through policy-as-code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Cloud Custodian policy for automated discovery&lt;/span&gt;
&lt;span class="na"&gt;policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;identify-unused-instances&lt;/span&gt;
    &lt;span class="na"&gt;resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws.ec2&lt;/span&gt;
    &lt;span class="na"&gt;filters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tag:DecommissionBy"&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;present&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;value&lt;/span&gt;
        &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tag:DecommissionBy"&lt;/span&gt;
        &lt;span class="na"&gt;value_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;expiration&lt;/span&gt;  &lt;span class="c1"&gt;# compare against days until the tagged date&lt;/span&gt;
        &lt;span class="na"&gt;op&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;less-than&lt;/span&gt;
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;7&lt;/span&gt;
    &lt;span class="na"&gt;actions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;notify&lt;/span&gt;
        &lt;span class="na"&gt;transport&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sns&lt;/span&gt;
          &lt;span class="na"&gt;topic&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;arn:aws:sns:us-east-1:123456789012:infrastructure-alerts&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When Cloud Custodian runs this policy, it generates output identifying specific resources that should be reviewed for decommissioning:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Found 12 resources &lt;span class="k"&gt;for &lt;/span&gt;policy identify-unused-instances:
- i-0abc123def456789 &lt;span class="o"&gt;(&lt;/span&gt;project-alpha-dev&lt;span class="o"&gt;)&lt;/span&gt; - DecommissionBy: 2024-08-15
- i-0def456abc123789 &lt;span class="o"&gt;(&lt;/span&gt;feature-beta-test&lt;span class="o"&gt;)&lt;/span&gt; - DecommissionBy: 2024-08-12
- i-0789123abc456def &lt;span class="o"&gt;(&lt;/span&gt;integration-poc&lt;span class="o"&gt;)&lt;/span&gt; - DecommissionBy: 2024-08-10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Resources identified by Cloud Custodian are then discussed during the structured communication cadences, ensuring that business context and technical dependencies are properly evaluated before any decommissioning decisions are made.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results: MyCoCo's Infrastructure Transformation
&lt;/h2&gt;

&lt;p&gt;The systematic approach delivered immediate and sustained improvements:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost Reduction:&lt;/strong&gt; MyCoCo eliminated &lt;strong&gt;$180,000 in annual infrastructure waste&lt;/strong&gt; within the first quarter, primarily from development environments and discontinued feature infrastructure that had been running indefinitely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Operational Efficiency:&lt;/strong&gt; Infrastructure requests were resolved &lt;strong&gt;60% faster&lt;/strong&gt; through clear ownership and approval processes. Product teams gained visibility into infrastructure costs associated with their features, leading to more informed technical decisions and natural cost consciousness.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Team Impact:&lt;/strong&gt; DevOps team stress decreased significantly as reactive "emergency" requests became planned initiatives coordinated through structured processes. Product Owners developed infrastructure awareness that improved their technical roadmap planning.&lt;/p&gt;

&lt;p&gt;Most importantly, the framework scaled with business growth. As MyCoCo expanded into new markets and launched additional product lines, the lifecycle management processes handled increased complexity without breaking down.&lt;/p&gt;

&lt;p&gt;Six months after implementation, Maya's security audits revealed zero zombie infrastructure, and Finance teams could accurately map 95% of infrastructure costs to specific business initiatives.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start with Organizational Process&lt;/strong&gt;: Technical tools cannot fix unclear accountability or poor communication. Establish RACI matrices and communication cadences before implementing automation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Implement Provider-Level Tagging&lt;/strong&gt;: Use the Terraform AWS provider's &lt;code&gt;default_tags&lt;/code&gt; block so that every taggable resource inherits ownership and lifecycle metadata automatically. This closes manual tagging gaps and makes automated discovery reliable.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enable Stakeholder Self-Service&lt;/strong&gt;: Provide read-only access to resource inventory and cost data through auditor roles, eliminating DevOps dependency for basic visibility while empowering teams to track their own infrastructure footprint.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Automate Discovery, Not Decisions&lt;/strong&gt;: Use Cloud Custodian and tagging strategies to identify optimization opportunities, but maintain human oversight for business impact assessment and final approval.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Treat Infrastructure as Investment Portfolio&lt;/strong&gt;: Regular review cycles with Finance, Product, and DevOps teams ensure resources align with business priorities and ROI expectations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Scale Process with Team Growth&lt;/strong&gt;: Systematic lifecycle management becomes more valuable as teams grow, preventing the organizational chaos that destroys productivity at enterprise scale.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;For teams managing infrastructure across multiple product lines, lifecycle management transforms reactive operations into strategic capability that enables sustainable scaling. The key is starting with clear accountability and structured communication—the automation becomes straightforward once ownership is established.&lt;/p&gt;

&lt;p&gt;Ready to eliminate infrastructure waste in your organization? Start with clear accountability frameworks and structured communication processes. Your bottom line will thank you.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's your biggest challenge with infrastructure lifecycle management? Have you dealt with zombie resources at your company? Share your experiences and strategies in the comments below!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>devops</category>
      <category>costoptimization</category>
      <category>infrastructure</category>
    </item>
    <item>
      <title>Breaking the Hero Culture: Why Sam's Burnout Nearly Killed MyCoCo's Growth</title>
      <dc:creator>Dhruv Chaudhary</dc:creator>
      <pubDate>Fri, 19 Sep 2025 16:30:03 +0000</pubDate>
      <link>https://forem.com/dc-shimla/breaking-the-hero-culture-why-sams-burnout-nearly-killed-our-company-growth-3eg6</link>
      <guid>https://forem.com/dc-shimla/breaking-the-hero-culture-why-sams-burnout-nearly-killed-our-company-growth-3eg6</guid>
      <description>&lt;p&gt;When your best engineer becomes your biggest bottleneck, company growth stalls and critical knowledge becomes dangerously concentrated. MyCoCo learned this lesson when their "DevOps hero" Sam—who single-handedly kept infrastructure running—burned out during a critical growth phase. By dismantling their hero culture and building sustainable team practices, they transformed from a fragile single-point-of-failure operation to a resilient engineering organization that could actually scale.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Problem:&lt;/strong&gt; Sam, MyCoCo's original DevOps engineer, had become the single point of failure—working 70-hour weeks, sole keeper of infrastructure knowledge, and the only person who could fix production issues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Crisis:&lt;/strong&gt; Sam's burnout during a major product launch left MyCoCo unable to deploy fixes for 4 days, nearly losing their largest enterprise customer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution:&lt;/strong&gt; Systematic knowledge distribution through documentation requirements, forced rotation of responsibilities, and cultural shift from rewarding heroes to building resilient teams.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Impact:&lt;/strong&gt; Incident resolution time fell by about 60%, single points of failure were eliminated, and Sam took his first real vacation in 3 years.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bottom Line:&lt;/strong&gt; If one person's absence can cripple your operations, you're not building a company—you're building a house of cards.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge: Sam's Impossible Position
&lt;/h2&gt;

&lt;p&gt;By late 2023, Sam had evolved into MyCoCo's superhero. As employee #3 and the original "DevOps person," he'd built every piece of infrastructure from scratch. He knew why that strange cron job ran at 3:47 AM (it avoided a rate limit issue from 2021). He understood the intricate dance of microservices that somehow kept MyCoCo Support running despite architectural decisions everyone now regretted.&lt;/p&gt;

&lt;p&gt;"Sam will know" became the company motto. Mysterious production issue? Sam fixed it. Deployment failed? Sam had a workaround. Customer-specific infrastructure need? Sam would handle it over the weekend.&lt;/p&gt;

&lt;p&gt;The warning signs were obvious in hindsight. Sam's Slack status permanently showed 🔥. He answered infrastructure questions at 11 PM while ostensibly watching Netflix. His "quick fixes" accumulated into an elaborate system only he understood. His vacation days rolled over year after year, unused because "what if something breaks?"&lt;/p&gt;

&lt;p&gt;Maya (Security Engineer) noticed it during her security audit prep:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Sam, who else knows how to rotate these certificates?" she asked. His pause said everything. "I've been meaning to document that..."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Alex (VP of Engineering) saw it in sprint planning. Every infrastructure task had Sam's name:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We can't parallelize if everything goes through one person," he pointed out. Sam's response: "It's faster if I just do it myself than explain the whole history."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Then came Black Friday 2023. MyCoCo's biggest sales event. Sam had been working 16-hour days preparing. On Thursday night, exhausted and making mistakes, he pushed a config change that broke production deployments. Then his laptop died. The replacement wouldn't arrive until Monday.&lt;/p&gt;

&lt;p&gt;For four days, MyCoCo couldn't deploy fixes. Customer tickets piled up. Their largest enterprise client threatened to leave. Jordan (Platform Engineer) and the platform team tried to help but couldn't decipher Sam's intricate web of bash scripts, environment variables, and "temporary" workarounds.&lt;/p&gt;

&lt;p&gt;Drew (CTO) finally called an emergency leadership meeting:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We've built our entire infrastructure on one person's brain. This isn't sustainable—it's negligent."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Solution: Dismantling Hero Culture
&lt;/h2&gt;

&lt;p&gt;The transformation started with an uncomfortable truth: MyCoCo had been rewarding the wrong behavior. They celebrated Sam's heroics in all-hands meetings. They praised his weekend work. They had inadvertently created a culture where being indispensable was valued over building resilient systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 1: Immediate Knowledge Transfer (Week 1-2)
&lt;/h3&gt;

&lt;p&gt;Alex instituted "Sam shadowing sessions." Every task Sam touched, someone watched and documented. Not detailed documentation—just enough to prevent total helplessness. Jordan paired with Sam on every production fix, asking "why" at each step.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 2: Forced Distribution (Week 3-8)
&lt;/h3&gt;

&lt;p&gt;The hardest change: Sam was banned from being first responder. When production issues arose, others had to try first. Sam could advise but not touch the keyboard. The first week was painful—a 15-minute Sam fix became a 2-hour team effort. But each incident built team knowledge.&lt;/p&gt;

&lt;p&gt;They instituted "documentation-driven operations"—if it wasn't documented, it didn't exist. No more "Sam knows" or "ask Sam." Every system needed a runbook that a new engineer could follow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 3: Cultural Reset (Week 9-16)
&lt;/h3&gt;

&lt;p&gt;Drew changed how they recognized work. Instead of celebrating weekend heroics, they celebrated knowledge sharing. The monthly MVP award went to whoever best distributed their expertise. On-call rotations became mandatory—everyone, including Alex, took shifts.&lt;/p&gt;

&lt;p&gt;They implemented "chaos days"—randomly chosen team members had to handle infrastructure tasks they'd never done before, with documentation as their only guide. Gaps in knowledge became visible and fixable before they became crises.&lt;/p&gt;

&lt;p&gt;Most importantly, they normalized boundaries. Sam was required to take vacation—a real one, with laptop left at home. The first time, he checked Slack every hour. By the third vacation, he actually relaxed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results: From Fragility to Resilience
&lt;/h2&gt;

&lt;p&gt;Six months later, MyCoCo's infrastructure culture had transformed:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Incident Response:&lt;/strong&gt; Mean time to resolution dropped from 2 hours to 45 minutes—because five people could investigate in parallel instead of waiting for Sam.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deployment Velocity:&lt;/strong&gt; Ship rate increased 40% as infrastructure work no longer bottlenecked through one person.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Team Growth:&lt;/strong&gt; They successfully onboarded three new platform engineers who became productive within weeks, not months.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Sam Factor:&lt;/strong&gt; Sam rediscovered why he loved technology. Instead of fighting fires, he led architectural improvements. His stress levels dropped. He took a two-week vacation to visit family in Germany—and production didn't even hiccup.&lt;/p&gt;

&lt;p&gt;The real test came during their Series B due diligence. When investors asked about infrastructure bus factor, Drew could confidently say:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Any three of our eight platform engineers could rebuild our entire system from documentation."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Hero Culture is Organizational Debt&lt;/strong&gt;: Every time you celebrate someone working weekends to save the day, you're borrowing against your future resilience.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Knowledge Hoarding Isn't Job Security&lt;/strong&gt;: Sam thought being indispensable made him valuable. In reality, it trapped him in operational work instead of strategic growth.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Sustainable Pace Beats Heroic Sprints&lt;/strong&gt;: MyCoCo now ships more with nobody working weekends than they did during Sam's 70-hour weeks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Documentation is Cheaper Than Burnout&lt;/strong&gt;: The time invested in runbooks and knowledge sharing paid for itself the first time someone other than Sam resolved a production issue.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Resilience Requires Redundancy&lt;/strong&gt;: If your organization can't survive someone's two-week vacation, you're not building a company—you're managing a crisis waiting to happen.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Breaking hero culture isn't about diminishing individual contributions—it's about building systems that amplify everyone's impact. Sam didn't become less valuable when knowledge was distributed; he became free to work on problems that actually required his expertise.&lt;/p&gt;

&lt;p&gt;Ready to break your own hero culture? Start by identifying your critical knowledge silos and implementing documentation requirements for every operational task. Your "heroes" will thank you, and your business will become truly resilient.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's your experience with hero culture at your company? How did you handle knowledge silos? Share your thoughts in the comments below!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>burnout</category>
      <category>teamwork</category>
      <category>engineeringculture</category>
    </item>
    <item>
      <title>How AWS Nuke Saved MyCoCo $10K Monthly by Automating Resource Cleanup</title>
      <dc:creator>Dhruv Chaudhary</dc:creator>
      <pubDate>Fri, 19 Sep 2025 16:19:33 +0000</pubDate>
      <link>https://forem.com/dc-shimla/how-aws-nuke-saved-us-10k-monthly-by-automating-resource-cleanup-3b05</link>
      <guid>https://forem.com/dc-shimla/how-aws-nuke-saved-us-10k-monthly-by-automating-resource-cleanup-3b05</guid>
      <description>&lt;p&gt;When MyCoCo's AWS bill jumped from $18K to $28K monthly due to forgotten non-production resources and abandoned team sandbox experiments, they discovered that manual cleanup wasn't scalable for a growing engineering organization. AWS Nuke became their automated solution for systematically destroying unused infrastructure while protecting critical resources. Here's how they implemented safe, automated resource cleanup that reduced costs by over one-third without risking production.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Problem:&lt;/strong&gt; Development teams creating and forgetting AWS resources in non-production accounts (development and staging), plus abandoned team sandbox experiments, causing $10K monthly waste across 15 AWS accounts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution:&lt;/strong&gt; AWS Nuke with strict configuration rules, account filtering, and automated scheduling to safely destroy unused resources in non-production and sandbox environments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Impact:&lt;/strong&gt; MyCoCo reduced AWS costs by 36%, eliminated manual cleanup overhead, and improved non-production environment lifecycle management.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Implementation:&lt;/strong&gt; Account-specific Nuke configurations with production safeguards, resource filtering, and CI/CD integration for scheduled cleanup across non-production and sandbox accounts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bottom Line:&lt;/strong&gt; Automate resource destruction with the same rigor as resource creation. AWS Nuke makes infrastructure cleanup safe and systematic.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge: MyCoCo's Resource Sprawl Crisis
&lt;/h2&gt;

&lt;p&gt;The wake-up call came during MyCoCo's monthly cost review. Sam (Senior DevOps Engineer) stared at an AWS bill that had grown by 56% in three months without any corresponding increase in customer traffic.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We have hundreds of unattached EBS volumes accumulating across our accounts," Sam announced during the engineering meeting. "Our non-production environments have snapshots dating back months that no one remembers creating. I found 50GB of unattached storage that's been sitting unused for months in one of our development accounts."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But the non-production accounts were only part of the problem. MyCoCo had embraced a team-based sandbox account strategy where each product team received their own dedicated AWS account for experimentation and learning. While this improved security isolation and prevented accidental cross-contamination between teams, it also created a resource sprawl nightmare.&lt;/p&gt;

&lt;p&gt;Jordan (Platform Engineer) dug deeper into the cost analysis:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"68% of our AWS spend is in non-production and sandbox accounts. The team sandbox accounts alone are costing us $7K monthly with forgotten compute resources. We're paying $1,800 monthly for RDS instances that haven't had a connection in weeks. There's a $600/month Elasticsearch cluster someone created for a POC that was abandoned. One team's sandbox account has been running a GPU instance for machine learning experiments since January - that's $2,400 we forgot about."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The sandbox problem was particularly challenging because these accounts were designed for team experimentation. Product teams would spin up expensive resources to test new technologies, complete their evaluation, and move on to other projects. Unlike non-production environments that followed project lifecycles, team sandbox experiments had no natural cleanup triggers.&lt;/p&gt;

&lt;p&gt;The manual cleanup process was broken. Teams would create resources for testing, finish their work, and forget to delete them. The monthly "cleanup reminders" were ignored because manually identifying and deleting resources was time-consuming and error-prone. Worse, team members were afraid of deleting resources that might still be needed by teammates.&lt;/p&gt;

&lt;p&gt;Maya (Security Engineer) raised the compliance concern:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We have orphaned resources with overly permissive security groups sitting in forgotten accounts. These are security risks we're paying for."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Solution: AWS Nuke with Surgical Precision
&lt;/h2&gt;

&lt;p&gt;MyCoCo's solution centered on AWS Nuke, a tool designed to systematically delete AWS resources based on configurable rules. But implementing it safely required recognizing that different account types need fundamentally different approaches: non-production environments support ongoing project work and need persistence, while sandbox accounts are for experimentation and benefit from aggressive cleanup.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Account Protection and Strategy
&lt;/h3&gt;

&lt;p&gt;First, they established clear account boundaries with absolute protection for production:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# nuke-config.yaml - Account protection and strategy&lt;/span&gt;
&lt;span class="na"&gt;regions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ca-central-1&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;us-east-2&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;global&lt;/span&gt;

&lt;span class="na"&gt;blocklist&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;111111111111"&lt;/span&gt; &lt;span class="c1"&gt;# Production account - absolute protection&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;666666666666"&lt;/span&gt; &lt;span class="c1"&gt;# Shared services account&lt;/span&gt;

&lt;span class="na"&gt;accounts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;222222222222"&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# Staging&lt;/span&gt;
    &lt;span class="na"&gt;presets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;non-prod-minimal-cleanup"&lt;/span&gt;
&lt;span class="err"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;333333333333"&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# Development&lt;/span&gt;
    &lt;span class="na"&gt;presets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;non-prod-minimal-cleanup"&lt;/span&gt;
&lt;span class="err"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;444444444444"&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# Team Sandbox A&lt;/span&gt;
    &lt;span class="na"&gt;presets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sandbox-aggressive"&lt;/span&gt;
&lt;span class="err"&gt;  &lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;555555555555"&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# Team Sandbox B&lt;/span&gt;
    &lt;span class="na"&gt;presets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sandbox-aggressive"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Non-Production Account Strategy - Minimal Cleanup Approach
&lt;/h3&gt;

&lt;p&gt;Non-production accounts support active projects that span weeks or months. Sam designed a minimal approach that protects all managed infrastructure while only cleaning up clearly abandoned manual storage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;presets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;non-prod-minimal-cleanup&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;filters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;__global__&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Manual protection tag for unmanaged resources that need retention&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;property&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tag:ignore-nuke"&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
        &lt;span class="c1"&gt;# Protect all managed infrastructure&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;property&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tag:CreatedBy"&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;terraform"&lt;/span&gt;

      &lt;span class="c1"&gt;# For unmanaged EBS volumes: only delete if unattached AND older than 1 week&lt;/span&gt;
      &lt;span class="na"&gt;EC2Volume&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;property&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;State"&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;in-use"&lt;/span&gt; &lt;span class="c1"&gt;# Protect all attached (in-use) volumes&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;property&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CreateTime"&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dateOlderThan"&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;168h"&lt;/span&gt;
          &lt;span class="na"&gt;invert&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="c1"&gt;# Protect volumes newer than 1 week&lt;/span&gt;

      &lt;span class="c1"&gt;# For unmanaged snapshots: only delete if older than 1 week&lt;/span&gt;
      &lt;span class="na"&gt;EC2Snapshot&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;property&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;StartTime"&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dateOlderThan"&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;168h"&lt;/span&gt;
          &lt;span class="na"&gt;invert&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="c1"&gt;# Protect snapshots newer than 1 week&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This approach protects all managed infrastructure (Terraform resources carrying the &lt;code&gt;CreatedBy=terraform&lt;/code&gt; tag, which is configured at the provider level) and every resource manually tagged &lt;code&gt;ignore-nuke=true&lt;/code&gt;. Only unmanaged storage that is clearly abandoned gets cleaned up: unattached EBS volumes and unmanaged snapshots that are more than one week old.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Sandbox Account Strategy - Aggressive Experimentation Cleanup
&lt;/h3&gt;

&lt;p&gt;Sandbox accounts are different: they exist for short-term experiments and proofs of concept (POCs). Here, aggressive cleanup helps teams by removing the burden of tracking and deleting resources by hand:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;presets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;sandbox-aggressive&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;filters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="c1"&gt;# Only protect landing zone infrastructure and manually tagged experiments&lt;/span&gt;
      &lt;span class="na"&gt;__global__&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;property&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tag:landing-zone"&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;property&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tag:ignore-nuke"&lt;/span&gt;
          &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This dramatically simpler configuration deletes everything except:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Landing zone infrastructure (VPCs, subnets, internet gateways) tagged during account setup&lt;/li&gt;
&lt;li&gt;Resources manually protected by team members for longer experiments&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 4: Automated Scheduling with Different Cadences
&lt;/h3&gt;

&lt;p&gt;Jordan automated the cleanup process using separate GitHub Actions workflows for each cleanup strategy. The workflows authenticate to AWS using OpenID Connect (OIDC) with dedicated IAM roles that have the necessary permissions for AWS Nuke operations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/sandbox-cleanup.yml&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Daily Sandbox Cleanup&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;schedule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;cron&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;6&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt; &lt;span class="c1"&gt;# Daily at 06:00 UTC (GitHub Actions schedules run in UTC)&lt;/span&gt;
  &lt;span class="na"&gt;workflow_dispatch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;nuke-sandbox-accounts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;id-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt; &lt;span class="c1"&gt;# Required for OIDC authentication&lt;/span&gt;
      &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;
    &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;matrix&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;account&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;444444444444"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;555555555555"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# Team sandbox accounts&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="c1"&gt;# Configure AWS credentials using OIDC&lt;/span&gt;
      &lt;span class="c1"&gt;# Download and execute AWS Nuke with configuration&lt;/span&gt;
      &lt;span class="c1"&gt;# See AWS and GitHub Actions documentation for implementation&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/non-prod-cleanup.yml&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Weekly Non-Prod Cleanup&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;schedule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;cron&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;6&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;SUN"&lt;/span&gt; &lt;span class="c1"&gt;# Weekly on Sunday at 06:00 UTC&lt;/span&gt;
  &lt;span class="na"&gt;workflow_dispatch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;nuke-non-prod-accounts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;id-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt; &lt;span class="c1"&gt;# Required for OIDC authentication&lt;/span&gt;
      &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;
    &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;matrix&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;account&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;222222222222"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;333333333333"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# Non-prod accounts&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="c1"&gt;# Configure AWS credentials using OIDC&lt;/span&gt;
      &lt;span class="c1"&gt;# Download and execute AWS Nuke with configuration&lt;/span&gt;
      &lt;span class="c1"&gt;# See AWS and GitHub Actions documentation for implementation&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Separate workflows provide better isolation, easier troubleshooting, and independent scheduling without complex conditional logic. The OIDC setup eliminates the need for long-lived AWS access keys while providing secure, temporary credentials for each workflow run.&lt;/p&gt;
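
&lt;p&gt;The &lt;code&gt;steps&lt;/code&gt; in both workflows are left as comments. As a rough sketch of how the sandbox job could be filled in, the example below uses the &lt;code&gt;aws-actions/configure-aws-credentials&lt;/code&gt; action for OIDC. The role name &lt;code&gt;github-aws-nuke&lt;/code&gt;, the region, the config path &lt;code&gt;nuke-config.yml&lt;/code&gt;, and the install script are all assumptions made for illustration, and the &lt;code&gt;--no-prompt&lt;/code&gt; flag should be checked against &lt;code&gt;aws-nuke run --help&lt;/code&gt; for the release you pin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Sketch only: role name, region, config path, and install script are assumptions
jobs:
  nuke-sandbox-accounts:
    runs-on: ubuntu-latest
    permissions:
      id-token: write # Required for OIDC authentication
      contents: read
    strategy:
      fail-fast: false # Keep cleaning the other accounts if one job fails
      matrix:
        account: ["444444444444", "555555555555"] # Team sandbox accounts
    steps:
      # Fetch the AWS Nuke configuration stored in this repository
      - uses: actions/checkout@v4

      - name: Configure AWS credentials using OIDC
        uses: aws-actions/configure-aws-credentials@v4
        with:
          # Hypothetical role name; one such role would exist in each target account
          role-to-assume: arn:aws:iam::${{ matrix.account }}:role/github-aws-nuke
          aws-region: us-east-1 # Assumed region

      - name: Install AWS Nuke
        # Hypothetical helper script that downloads a pinned ekristen/aws-nuke release
        run: ./scripts/install-aws-nuke.sh

      - name: Run AWS Nuke (dry run)
        # Dry run is the default; append --no-dry-run only after reviewing the output
        run: aws-nuke run --config nuke-config.yml --no-prompt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because each matrix job assumes a role in a single account, AWS Nuke only ever acts on the account it is authenticated against, and that account must also appear in the configuration's account list. Leaving the job in dry-run mode for the first few scheduled runs gives the team a chance to review what would be deleted before enabling real deletion.&lt;/p&gt;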

&lt;h3&gt;
  
  
  Step 5: Cost Monitoring
&lt;/h3&gt;

&lt;p&gt;Maya set up simple cost tracking using AWS Cost Explorer to monitor the effectiveness of their cleanup efforts, comparing monthly spending before and after implementing automated resource deletion.&lt;/p&gt;

&lt;p&gt;To handle legitimate long-running experiments, the team adopted a manual tagging strategy. Team members add an &lt;code&gt;ignore-nuke=true&lt;/code&gt; tag to any resource that needs to outlive the standard cleanup cycle. In addition, the landing zone infrastructure (VPCs, subnets, internet gateways, route tables) was pre-tagged with &lt;code&gt;landing-zone=true&lt;/code&gt; during account setup so that basic networking stays intact. This gives teams a self-service way to protect their resources without scripts or special permissions, using only standard AWS resource tagging.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results: MyCoCo's Automated Cleanup Success
&lt;/h2&gt;

&lt;p&gt;The results were clear. Within three months, MyCoCo had cut its monthly AWS bill by roughly a third and cleared out its backlog of untracked resources.&lt;/p&gt;

&lt;h3&gt;
  
  
  Financial Impact
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;AWS costs reduced from $28K to $18K monthly (&lt;strong&gt;36% reduction&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$10K in monthly savings&lt;/strong&gt; from automated cleanup&lt;/li&gt;
&lt;li&gt;Sandbox account costs dropped from $7K to $3K monthly&lt;/li&gt;
&lt;li&gt;Immediate cost savings visible within the first cleanup cycle&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Operational Improvements
&lt;/h3&gt;

&lt;p&gt;The automated approach eliminated the manual cleanup burden entirely. Non-production teams no longer had to worry about running infrastructure being touched, because AWS Nuke only cleaned up forgotten storage in those accounts. The protection tags gave teams control when they needed to keep resources longer.&lt;/p&gt;

&lt;p&gt;The sandbox account cleanup was particularly transformative. The simplified approach of "delete everything except landing zone infrastructure" eliminated the complexity of managing different retention policies for different resource types. Daily automated cleanup meant teams could experiment freely without cost anxiety, knowing that only the basic networking foundation would persist.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"It's like having a responsible roommate who cleans up after everyone," Sam commented during their retrospective.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The daily sandbox cleanup cycles meant that experimental resource costs never accumulated beyond a day, while non-production environments had their compute resources completely protected with only storage waste removed weekly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Development Velocity
&lt;/h3&gt;

&lt;p&gt;Paradoxically, automated destruction improved development velocity. Teams felt more comfortable creating experimental resources because they knew those resources would be cleaned up automatically, which removed the fear of leaving expensive infrastructure running.&lt;/p&gt;

&lt;p&gt;Maya also noted an improved security posture:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We went from having 150+ unknown resources scattered across accounts to maintaining clean, auditable environments. Every resource running has a purpose and an expiration date."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use the Maintained Version&lt;/strong&gt;: Always use the actively maintained &lt;code&gt;ekristen/aws-nuke&lt;/code&gt; fork rather than the archived original repository. The maintained version includes security fixes and ongoing AWS service support.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start with Account Protection&lt;/strong&gt;: Use blocklists to absolutely protect production accounts. Never rely on filters alone to protect critical infrastructure.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Tailor Strategies by Account Type&lt;/strong&gt;: Non-production accounts need minimal cleanup that only touches abandoned storage, while sandbox accounts benefit from aggressive "delete everything except landing zone" approaches.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Reserve Aggressive Cleanup for Sandbox Accounts&lt;/strong&gt;: Non-production environments often need to persist for ongoing project work, so limit cleanup there to abandoned, unmanaged storage that is more than a week old. Keep the scheduled "delete everything" runs for true sandbox/experimental accounts, where resources are more likely to be forgotten.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Simple Tagging Works&lt;/strong&gt;: Resource tagging provides comprehensive protection through three key tags:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;CreatedBy=terraform&lt;/code&gt; automatically protects all managed infrastructure&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ignore-nuke=true&lt;/code&gt; gives teams manual protection for unmanaged resources&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;landing-zone=true&lt;/code&gt; preserves essential networking infrastructure&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Monitor and Measure&lt;/strong&gt;: Track cost savings and resource reduction to demonstrate value. Automated cleanup should show immediate cost benefits within the first cleanup cycle.&lt;/p&gt;&lt;/li&gt;
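&lt;li&gt;&lt;p&gt;&lt;strong&gt;Test Extensively&lt;/strong&gt;: Always start with dry runs, which is how AWS Nuke behaves unless &lt;code&gt;--no-dry-run&lt;/code&gt; is passed, and introduce automation gradually. The tool deletes resources irreversibly, so careful testing is essential.&lt;/p&gt;&lt;/li&gt;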
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;AWS Nuke transforms resource cleanup from a manual chore into an automated process that keeps environments clean and costs predictable. The key is implementing it safely with comprehensive protection mechanisms and clear boundaries between environments.&lt;/p&gt;

&lt;p&gt;Ready to implement automated AWS resource cleanup? Start with AWS Nuke's account protection features and gradually build comprehensive cleanup strategies for your non-production and sandbox environments.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's your experience with AWS cost optimization? Have you tried automated resource cleanup tools? Share your thoughts in the comments below!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>devops</category>
      <category>costoptimization</category>
      <category>automation</category>
    </item>
  </channel>
</rss>
