<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Kehinde Ogunlowo</title>
    <description>The latest articles on Forem by Kehinde Ogunlowo (@kogunlowo123).</description>
    <link>https://forem.com/kogunlowo123</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3812123%2F604d88cd-050b-4d1a-935b-1a6b3f5ef514.jpeg</url>
      <title>Forem: Kehinde Ogunlowo</title>
      <link>https://forem.com/kogunlowo123</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/kogunlowo123"/>
    <language>en</language>
    <item>
      <title>The Complete Guide to Production EKS with Terraform</title>
      <dc:creator>Kehinde Ogunlowo</dc:creator>
      <pubDate>Sun, 08 Mar 2026 18:56:56 +0000</pubDate>
      <link>https://forem.com/kogunlowo123/the-complete-guide-to-production-eks-with-terraform-f80</link>
      <guid>https://forem.com/kogunlowo123/the-complete-guide-to-production-eks-with-terraform-f80</guid>
      <description>&lt;h2&gt;
  
  
  "Production-ready EKS deployment with Terraform — Karpenter autoscaling, self-healing nodes, pod security standards, and multi-AZ high availability."
&lt;/h2&gt;

&lt;p&gt;EKS is the most popular managed Kubernetes service, but most deployments I've seen in production audits are dangerously under-configured. Missing node auto-remediation, no pod security standards, manual scaling — the list goes on.&lt;/p&gt;

&lt;p&gt;This guide covers everything you need for production EKS.&lt;/p&gt;

&lt;h2&gt;
  
  
  EKS vs AKS vs GKE
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;EKS&lt;/th&gt;
&lt;th&gt;AKS&lt;/th&gt;
&lt;th&gt;GKE&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Control Plane Cost&lt;/td&gt;
&lt;td&gt;$0.10/hr&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;td&gt;Free (Standard)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Autopilot Mode&lt;/td&gt;
&lt;td&gt;No (use Karpenter)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Node Auto-Repair&lt;/td&gt;
&lt;td&gt;Manual/Lambda&lt;/td&gt;
&lt;td&gt;Built-in&lt;/td&gt;
&lt;td&gt;Built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Service Mesh&lt;/td&gt;
&lt;td&gt;App Mesh / Istio&lt;/td&gt;
&lt;td&gt;Istio&lt;/td&gt;
&lt;td&gt;Anthos / Istio&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPU Support&lt;/td&gt;
&lt;td&gt;p4d, g5&lt;/td&gt;
&lt;td&gt;NC, ND series&lt;/td&gt;
&lt;td&gt;T4, A100&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Terraform Module
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"eks"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"github.com/kogunlowo123/terraform-aws-auto-healing-eks"&lt;/span&gt;

  &lt;span class="nx"&gt;cluster_name&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"production-cluster"&lt;/span&gt;
  &lt;span class="nx"&gt;cluster_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"1.29"&lt;/span&gt;
  &lt;span class="nx"&gt;vpc_id&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vpc_id&lt;/span&gt;
  &lt;span class="nx"&gt;subnet_ids&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;private_subnet_ids&lt;/span&gt;

  &lt;span class="nx"&gt;node_groups&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"general"&lt;/span&gt;
    &lt;span class="nx"&gt;instance_types&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"m6i.xlarge"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"m6i.2xlarge"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;min_size&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="nx"&gt;max_size&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
    &lt;span class="nx"&gt;desired_size&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
  &lt;span class="p"&gt;}]&lt;/span&gt;

  &lt;span class="nx"&gt;enable_karpenter&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;enable_cluster_autoscaler&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;  &lt;span class="c1"&gt;# Use Karpenter instead&lt;/span&gt;
  &lt;span class="nx"&gt;enable_node_termination_handler&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;enable_auto_remediation&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Best Practices
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Use Karpenter over Cluster Autoscaler&lt;/strong&gt; for faster scaling and better bin-packing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enable pod disruption budgets&lt;/strong&gt; for every production workload&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use node termination handler&lt;/strong&gt; for spot instance graceful shutdown&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Implement network policies&lt;/strong&gt; with Calico or Cilium&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enable control plane logging&lt;/strong&gt; to CloudWatch&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use IRSA&lt;/strong&gt; (IAM Roles for Service Accounts) instead of node-level IAM&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Open Source
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/kogunlowo123/terraform-aws-auto-healing-eks" rel="noopener noreferrer"&gt;terraform-aws-auto-healing-eks&lt;/a&gt; — Self-healing EKS&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/kogunlowo123/terraform-aws-eks" rel="noopener noreferrer"&gt;terraform-aws-eks&lt;/a&gt; — Standard EKS module&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/kogunlowo123/terraform-aws-vpc-complete" rel="noopener noreferrer"&gt;terraform-aws-vpc-complete&lt;/a&gt; — VPC for EKS&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Full guide: &lt;a href="https://kogunlowo123.github.io/blog/terraform-eks-complete-guide.html" rel="noopener noreferrer"&gt;kogunlowo123.github.io&lt;/a&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>aws</category>
      <category>terraform</category>
      <category>devops</category>
    </item>
    <item>
      <title>Building Production RAG Pipelines on AWS with Bedrock and OpenSearch</title>
      <dc:creator>Kehinde Ogunlowo</dc:creator>
      <pubDate>Sun, 08 Mar 2026 18:54:25 +0000</pubDate>
      <link>https://forem.com/kogunlowo123/building-production-rag-pipelines-on-aws-with-bedrock-and-opensearch-o01</link>
      <guid>https://forem.com/kogunlowo123/building-production-rag-pipelines-on-aws-with-bedrock-and-opensearch-o01</guid>
      <description>&lt;p&gt;RAG (Retrieval-Augmented Generation) is how enterprises are deploying LLMs without fine-tuning. But most tutorials stop at the demo stage. Production RAG is a different beast entirely.&lt;/p&gt;

&lt;p&gt;Here's what production RAG actually requires — and how to build it on AWS.&lt;/p&gt;

&lt;h2&gt;
  
  
  RAG vs Fine-Tuning vs Prompt Engineering
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Data Freshness&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;th&gt;Complexity&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;RAG&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Real-time&lt;/td&gt;
&lt;td&gt;High (with good retrieval)&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fine-Tuning&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Static (retraining needed)&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prompt Engineering&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Static&lt;/td&gt;
&lt;td&gt;Variable&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;The pipeline: Documents → Chunking → Embeddings → Vector Store → Query → Retrieval → LLM → Response.&lt;/p&gt;

&lt;h2&gt;
  
  
  Python Implementation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="n"&gt;bedrock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock-runtime&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;us-east-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;opensearch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;opensearchserverless&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;query_knowledge_base&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;collection_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Generate embedding for the question
&lt;/span&gt;    &lt;span class="n"&gt;embed_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bedrock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;amazon.titan-embed-text-v2:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inputText&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embed_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;())[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;embedding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Search OpenSearch vector store
&lt;/span&gt;    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;search_vectors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;collection_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="c1"&gt;# Generate answer with context
&lt;/span&gt;    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Based on the following context, answer the question.

Context: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;

Question: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;

Answer:&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bedrock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic.claude-3-sonnet-20240229-v1:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic_version&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock-2023-05-31&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;())[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Hallucination Mitigation
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Chunk size matters&lt;/strong&gt; — 512 tokens with 50-token overlap works best&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid search&lt;/strong&gt; — combine semantic + keyword search (BM25)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Citation grounding&lt;/strong&gt; — force the model to cite source chunks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Confidence scoring&lt;/strong&gt; — filter low-relevance retrievals (cosine similarity &amp;lt; 0.7)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Guardrails&lt;/strong&gt; — use Bedrock Guardrails to block off-topic responses&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Open Source Modules
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/kogunlowo123/terraform-aws-rag-pipeline" rel="noopener noreferrer"&gt;terraform-aws-rag-pipeline&lt;/a&gt; — Complete RAG infrastructure&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/kogunlowo123/terraform-aws-bedrock-platform" rel="noopener noreferrer"&gt;terraform-aws-bedrock-platform&lt;/a&gt; — Bedrock foundation&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/kogunlowo123/terraform-aws-bedrock-agents" rel="noopener noreferrer"&gt;terraform-aws-bedrock-agents&lt;/a&gt; — Autonomous Bedrock agents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Full article with architecture diagrams: &lt;a href="https://kogunlowo123.github.io/blog/production-rag-pipelines-aws.html" rel="noopener noreferrer"&gt;kogunlowo123.github.io&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Kehinde Ogunlowo — Principal Multi-Cloud DevSecOps Architect at &lt;a href="https://citadelcloudmanagement.com" rel="noopener noreferrer"&gt;Citadel Cloud Management&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>ai</category>
      <category>aws</category>
      <category>python</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title># Building a Production-Ready AWS VPC with Terraform: Multi-Tier Subnets, NAT Gateways, and VPC Endpoints</title>
      <dc:creator>Kehinde Ogunlowo</dc:creator>
      <pubDate>Sat, 07 Mar 2026 21:20:40 +0000</pubDate>
      <link>https://forem.com/kogunlowo123/-building-a-production-ready-aws-vpc-with-terraform-multi-tier-subnets-nat-gateways-and-vpc-49bo</link>
      <guid>https://forem.com/kogunlowo123/-building-a-production-ready-aws-vpc-with-terraform-multi-tier-subnets-nat-gateways-and-vpc-49bo</guid>
      <description>&lt;p&gt;A network topology diagram showing a multi-tier VPC with public, private, database, cache, and management subnets across three availability zones, with NAT gateways and VPC endpoints illustrated.&lt;/p&gt;

&lt;p&gt;If you've ever inherited an AWS account where everything lives in the default VPC, you know the pain. Security groups used as the only network boundary. No flow logs. Public IP addresses on database instances. It's the kind of setup that keeps security teams awake at night.&lt;/p&gt;

&lt;p&gt;A well-architected VPC is the foundation of everything you build on AWS. Get it wrong, and you're retrofitting network isolation into a running production system — one of the least enjoyable exercises in cloud engineering.&lt;/p&gt;

&lt;p&gt;In this article, I'll walk through a production-grade VPC architecture using Terraform, based on a module I've been refining across multiple enterprise deployments. The full module is available at &lt;a href="https://github.com/kogunlowo123/terraform-aws-vpc-complete" rel="noopener noreferrer"&gt;terraform-aws-vpc-complete&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why VPC Architecture Matters More Than You Think
&lt;/h2&gt;

&lt;p&gt;Most teams start with a simple public/private subnet split and call it a day. That works for a proof of concept, but production workloads demand more:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Regulatory compliance&lt;/strong&gt; often requires network-level isolation between application tiers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost optimization&lt;/strong&gt; depends on keeping traffic within the AWS network via VPC endpoints&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Blast radius reduction&lt;/strong&gt; means a compromised web server shouldn't have a network path to your database&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operational visibility&lt;/strong&gt; requires flow logs that actually capture meaningful traffic patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's build something that addresses all of these.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 5-Tier Subnet Strategy
&lt;/h2&gt;

&lt;p&gt;Instead of the typical two-tier model, I use five distinct subnet tiers:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Internet Access&lt;/th&gt;
&lt;th&gt;Example Workloads&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Public&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Internet-facing resources&lt;/td&gt;
&lt;td&gt;Direct (IGW)&lt;/td&gt;
&lt;td&gt;ALBs, NAT Gateways, Bastion hosts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Application&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Compute workloads&lt;/td&gt;
&lt;td&gt;Outbound via NAT&lt;/td&gt;
&lt;td&gt;ECS tasks, EC2 instances, Lambda&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Databases and storage&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;RDS, ElastiCache, OpenSearch&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cache&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;In-memory data stores&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Redis clusters, Memcached&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Management&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ops and monitoring tools&lt;/td&gt;
&lt;td&gt;Outbound via NAT&lt;/td&gt;
&lt;td&gt;Jenkins, Prometheus, VPN endpoints&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each tier gets its own subnet in every availability zone, its own route table, and its own set of network ACLs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Defining the Network Layout
&lt;/h3&gt;

&lt;p&gt;Here's how the module structures the CIDR allocation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"vpc"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"github.com/kogunlowo123/terraform-aws-vpc-complete"&lt;/span&gt;

  &lt;span class="nx"&gt;vpc_name&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"production"&lt;/span&gt;
  &lt;span class="nx"&gt;vpc_cidr&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"10.0.0.0/16"&lt;/span&gt;
  &lt;span class="nx"&gt;azs&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"eu-west-1a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"eu-west-1b"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"eu-west-1c"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

  &lt;span class="c1"&gt;# 5-tier subnet strategy&lt;/span&gt;
  &lt;span class="nx"&gt;public_subnets&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"10.0.1.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.2.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.3.0/24"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;application_subnets&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"10.0.10.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.11.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.12.0/24"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;data_subnets&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"10.0.20.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.21.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.22.0/24"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;cache_subnets&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"10.0.30.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.31.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.32.0/24"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;management_subnets&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"10.0.40.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.41.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.42.0/24"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

  &lt;span class="nx"&gt;enable_nat_gateway&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;single_nat_gateway&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;  &lt;span class="c1"&gt;# one per AZ for HA&lt;/span&gt;
  &lt;span class="nx"&gt;enable_vpn_gateway&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="nx"&gt;enable_dns_hostnames&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;enable_dns_support&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;Environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"production"&lt;/span&gt;
    &lt;span class="nx"&gt;ManagedBy&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform"&lt;/span&gt;
    &lt;span class="nx"&gt;Project&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"core-infrastructure"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key design decision here is the CIDR allocation. I leave gaps between tiers (&lt;code&gt;/24&lt;/code&gt; blocks starting at &lt;code&gt;.1&lt;/code&gt;, &lt;code&gt;.10&lt;/code&gt;, &lt;code&gt;.20&lt;/code&gt;, &lt;code&gt;.30&lt;/code&gt;, &lt;code&gt;.40&lt;/code&gt;) so there's room to add subnets later without re-addressing. In a &lt;code&gt;/16&lt;/code&gt; VPC, you have 65,536 addresses — don't be stingy with the spacing.&lt;/p&gt;

&lt;h2&gt;
  
  
  NAT Gateway Strategy: Cost vs. Availability
&lt;/h2&gt;

&lt;p&gt;NAT gateways are one of the most expensive networking components in AWS. Each one costs roughly $32/month plus data processing charges. With three AZs, that's nearly $100/month just for NAT — before any traffic flows.&lt;/p&gt;

&lt;p&gt;For production, I always recommend one NAT gateway per AZ:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# High-availability NAT configuration&lt;/span&gt;
&lt;span class="nx"&gt;enable_nat_gateway&lt;/span&gt;   &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="nx"&gt;single_nat_gateway&lt;/span&gt;   &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="nx"&gt;one_nat_gateway_per_az&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For development and staging environments, a single NAT gateway is acceptable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Cost-optimized NAT for non-production&lt;/span&gt;
&lt;span class="nx"&gt;enable_nat_gateway&lt;/span&gt;   &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="nx"&gt;single_nat_gateway&lt;/span&gt;   &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The module handles the route table associations automatically. Each private subnet's route table points to the NAT gateway in its own AZ, so if one AZ goes down, the other AZs continue to function.&lt;/p&gt;

&lt;h3&gt;
  
  
  Route Table Isolation
&lt;/h3&gt;

&lt;p&gt;Each subnet tier gets its own route table. This is critical — it means you can have application subnets that route through NAT while data subnets have no route to the internet at all:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_route_table"&lt;/span&gt; &lt;span class="s2"&gt;"data"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;count&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;azs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;vpc_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_vpc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;

  &lt;span class="c1"&gt;# No default route — these subnets are fully isolated&lt;/span&gt;
  &lt;span class="c1"&gt;# Only VPC-local traffic (10.0.0.0/16) is routable&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${var.vpc_name}-data-${var.azs[count.index]}"&lt;/span&gt;
    &lt;span class="nx"&gt;Tier&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"data"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No NAT gateway route. No internet gateway route. The database subnets literally cannot reach the internet, and the internet cannot reach them. This is defense in depth at the network layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  VPC Endpoints: Keeping Traffic Private and Cheap
&lt;/h2&gt;

&lt;p&gt;Every call to an AWS API from within your VPC traverses the NAT gateway by default. That means S3 downloads, CloudWatch metrics, ECR image pulls — all going out through NAT and incurring data processing charges.&lt;/p&gt;

&lt;p&gt;VPC endpoints solve this by creating private paths to AWS services:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Gateway endpoints (free)&lt;/span&gt;
&lt;span class="nx"&gt;enable_s3_endpoint&lt;/span&gt;       &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="nx"&gt;enable_dynamodb_endpoint&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="c1"&gt;# Interface endpoints (cost per hour + per GB)&lt;/span&gt;
&lt;span class="nx"&gt;enable_interface_endpoints&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="nx"&gt;interface_endpoints&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="s2"&gt;"ecr.api"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s2"&gt;"ecr.dkr"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s2"&gt;"ecs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s2"&gt;"ecs-telemetry"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s2"&gt;"logs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s2"&gt;"monitoring"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s2"&gt;"ssm"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s2"&gt;"ssmmessages"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s2"&gt;"ec2messages"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s2"&gt;"sts"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s2"&gt;"secretsmanager"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="s2"&gt;"kms"&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Gateway endpoints&lt;/strong&gt; (S3 and DynamoDB) are free and should always be enabled. There's no reason not to.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Interface endpoints&lt;/strong&gt; cost about $7.20/month each (in &lt;code&gt;us-east-1&lt;/code&gt;), plus $0.01/GB of data processed. The math works out in your favor when you're pulling container images from ECR multiple times a day or shipping thousands of CloudWatch metrics per minute.&lt;/p&gt;

&lt;p&gt;A cluster pulling 50GB of container images monthly from ECR through NAT costs roughly $2.25 in NAT data processing. The ECR interface endpoints cost $14.40/month. At scale, though — say 500GB — the NAT cost is $22.50 and the endpoint cost is still $14.40. The crossover point depends on your traffic patterns, but the security benefit of keeping traffic off the public internet is worth the cost regardless.&lt;/p&gt;

&lt;h2&gt;
  
  
  VPC Flow Logs: Visibility Into Your Network
&lt;/h2&gt;

&lt;p&gt;Flow logs are non-negotiable in production. They capture metadata about every network flow — source, destination, ports, protocol, action (accept/reject), and byte counts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Flow log configuration&lt;/span&gt;
&lt;span class="nx"&gt;enable_flow_logs&lt;/span&gt;          &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="nx"&gt;flow_log_destination_type&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"cloud-watch-logs"&lt;/span&gt;
&lt;span class="nx"&gt;flow_log_retention_days&lt;/span&gt;   &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;90&lt;/span&gt;

&lt;span class="nx"&gt;flow_log_cloudwatch_log_group&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"/vpc/production/flow-logs"&lt;/span&gt;

&lt;span class="nx"&gt;flow_log_traffic_type&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ALL"&lt;/span&gt;  &lt;span class="c1"&gt;# capture both accepted and rejected&lt;/span&gt;

&lt;span class="c1"&gt;# Custom log format for richer data&lt;/span&gt;
&lt;span class="nx"&gt;flow_log_format&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"$${version} $${account-id} $${interface-id} $${srcaddr} $${dstaddr} $${srcport} $${dstport} $${protocol} $${packets} $${bytes} $${start} $${end} $${action} $${log-status} $${vpc-id} $${subnet-id} $${az-id} $${sublocation-type} $${sublocation-id}"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I recommend shipping flow logs to CloudWatch Logs for real-time alerting and to S3 for long-term storage and analysis with Athena:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Dual destination for flow logs&lt;/span&gt;
&lt;span class="nx"&gt;enable_flow_logs_s3&lt;/span&gt;          &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="nx"&gt;flow_log_s3_bucket_arn&lt;/span&gt;       &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_s3_bucket&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;flow_logs&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
&lt;span class="nx"&gt;flow_log_s3_key_prefix&lt;/span&gt;       &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"vpc-flow-logs/production"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Querying Flow Logs with Athena
&lt;/h3&gt;

&lt;p&gt;Once flow logs land in S3, you can create an Athena table and run SQL queries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Find rejected traffic to data subnets&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;srcaddr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dstaddr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dstport&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;protocol&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;total_bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;flow_count&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;vpc_flow_logs&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'REJECT'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;dstaddr&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'10.0.2%'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2026/03/06'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;srcaddr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dstaddr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dstport&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;protocol&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;flow_count&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is how you catch misconfigured security groups, unexpected traffic patterns, and potential intrusion attempts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Network ACLs: The Second Layer
&lt;/h2&gt;

&lt;p&gt;Security groups are stateful and operate at the instance level. Network ACLs are stateless and operate at the subnet level. Using both gives you defense in depth:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Data tier NACLs - restrict to only application tier&lt;/span&gt;
&lt;span class="nx"&gt;data_subnet_nacl_rules&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;inbound&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;rule_number&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
      &lt;span class="nx"&gt;protocol&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"tcp"&lt;/span&gt;
      &lt;span class="nx"&gt;action&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"allow"&lt;/span&gt;
      &lt;span class="nx"&gt;cidr_block&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"10.0.10.0/24"&lt;/span&gt;  &lt;span class="c1"&gt;# app subnet AZ-a&lt;/span&gt;
      &lt;span class="nx"&gt;from_port&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5432&lt;/span&gt;
      &lt;span class="nx"&gt;to_port&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5432&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;rule_number&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;101&lt;/span&gt;
      &lt;span class="nx"&gt;protocol&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"tcp"&lt;/span&gt;
      &lt;span class="nx"&gt;action&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"allow"&lt;/span&gt;
      &lt;span class="nx"&gt;cidr_block&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"10.0.11.0/24"&lt;/span&gt;  &lt;span class="c1"&gt;# app subnet AZ-b&lt;/span&gt;
      &lt;span class="nx"&gt;from_port&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5432&lt;/span&gt;
      &lt;span class="nx"&gt;to_port&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5432&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;rule_number&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;102&lt;/span&gt;
      &lt;span class="nx"&gt;protocol&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"tcp"&lt;/span&gt;
      &lt;span class="nx"&gt;action&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"allow"&lt;/span&gt;
      &lt;span class="nx"&gt;cidr_block&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"10.0.12.0/24"&lt;/span&gt;  &lt;span class="c1"&gt;# app subnet AZ-c&lt;/span&gt;
      &lt;span class="nx"&gt;from_port&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5432&lt;/span&gt;
      &lt;span class="nx"&gt;to_port&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5432&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;outbound&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;rule_number&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
      &lt;span class="nx"&gt;protocol&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"tcp"&lt;/span&gt;
      &lt;span class="nx"&gt;action&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"allow"&lt;/span&gt;
      &lt;span class="nx"&gt;cidr_block&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"10.0.10.0/0"&lt;/span&gt;
      &lt;span class="nx"&gt;from_port&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;
      &lt;span class="nx"&gt;to_port&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;65535&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ensures that even if someone misconfigures a security group, the database subnets only accept PostgreSQL traffic from the application tier.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tagging Strategy for Cost Allocation
&lt;/h2&gt;

&lt;p&gt;Every subnet, route table, NAT gateway, and endpoint gets tagged consistently:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;subnet_tags&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"kubernetes.io/cluster/production"&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"shared"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;public_subnet_tags&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"kubernetes.io/role/elb"&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"1"&lt;/span&gt;
  &lt;span class="nx"&gt;Tier&lt;/span&gt;                     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"public"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;application_subnet_tags&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"kubernetes.io/role/internal-elb"&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"1"&lt;/span&gt;
  &lt;span class="nx"&gt;Tier&lt;/span&gt;                              &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"application"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These tags serve double duty: they enable Kubernetes auto-discovery of subnets for load balancers, and they allow cost allocation reports to break down networking costs by tier.&lt;/p&gt;

&lt;h2&gt;
  
  
  Putting It All Together
&lt;/h2&gt;

&lt;p&gt;Here's a complete example that ties everything together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"vpc"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"github.com/kogunlowo123/terraform-aws-vpc-complete"&lt;/span&gt;

  &lt;span class="nx"&gt;vpc_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"production"&lt;/span&gt;
  &lt;span class="nx"&gt;vpc_cidr&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"10.0.0.0/16"&lt;/span&gt;
  &lt;span class="nx"&gt;azs&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"eu-west-1a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"eu-west-1b"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"eu-west-1c"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

  &lt;span class="c1"&gt;# Subnet tiers&lt;/span&gt;
  &lt;span class="nx"&gt;public_subnets&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"10.0.1.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.2.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.3.0/24"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;application_subnets&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"10.0.10.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.11.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.12.0/24"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;data_subnets&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"10.0.20.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.21.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.22.0/24"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;cache_subnets&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"10.0.30.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.31.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.32.0/24"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;management_subnets&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"10.0.40.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.41.0/24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"10.0.42.0/24"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

  &lt;span class="c1"&gt;# NAT&lt;/span&gt;
  &lt;span class="nx"&gt;enable_nat_gateway&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;single_nat_gateway&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="nx"&gt;one_nat_gateway_per_az&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

  &lt;span class="c1"&gt;# VPC Endpoints&lt;/span&gt;
  &lt;span class="nx"&gt;enable_s3_endpoint&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;enable_dynamodb_endpoint&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;enable_interface_endpoints&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;interface_endpoints&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"ecr.api"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"ecr.dkr"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"logs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"monitoring"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"ssm"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"sts"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"kms"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

  &lt;span class="c1"&gt;# Flow Logs&lt;/span&gt;
  &lt;span class="nx"&gt;enable_flow_logs&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;flow_log_traffic_type&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ALL"&lt;/span&gt;
  &lt;span class="nx"&gt;flow_log_retention_days&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;90&lt;/span&gt;

  &lt;span class="c1"&gt;# DNS&lt;/span&gt;
  &lt;span class="nx"&gt;enable_dns_hostnames&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;enable_dns_support&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;Environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"production"&lt;/span&gt;
    &lt;span class="nx"&gt;ManagedBy&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;p&gt;After deploying this pattern across dozens of AWS accounts, here are the things that consistently trip teams up:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Don't use /16 for everything.&lt;/strong&gt; If you ever need VPC peering or Transit Gateway, overlapping CIDRs will block you. Plan your IP address space across all accounts upfront.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enable flow logs from day one.&lt;/strong&gt; Retroactively enabling them means you have no baseline to compare against when something goes wrong.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Test failover.&lt;/strong&gt; Kill a NAT gateway and verify that the AZ-local route table doesn't send traffic to a NAT in another AZ. If it does, your route table associations are wrong.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Review VPC endpoint policies.&lt;/strong&gt; An S3 gateway endpoint with a default policy allows access to every S3 bucket in every AWS account. Lock it down to your buckets.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Monitor NAT gateway metrics.&lt;/strong&gt; &lt;code&gt;BytesOutToDestination&lt;/code&gt; and &lt;code&gt;PacketsDropCount&lt;/code&gt; will tell you when you're approaching throughput limits (45 Gbps per gateway).&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;A production VPC isn't just about creating subnets. It's about building a network foundation that enforces security boundaries, optimizes costs, and provides the visibility you need when things go wrong at 3 AM.&lt;/p&gt;

&lt;p&gt;The module I've walked through here is available on GitHub. Fork it, adapt it, and build on it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/kogunlowo123/terraform-aws-vpc-complete" rel="noopener noreferrer"&gt;terraform-aws-vpc-complete&lt;/a&gt; — The full VPC module with all configurations discussed in this article&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Check out my other infrastructure modules at &lt;a href="https://github.com/kogunlowo123" rel="noopener noreferrer"&gt;github.com/kogunlowo123&lt;/a&gt; for complementary resources covering WAF, AKS, and multi-cloud landing zones.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Have questions about VPC design or want to share your own patterns? Drop a comment below — I'd love to hear how other teams approach this.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
    </item>
  </channel>
</rss>
