<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: kaustubh yerkade</title>
    <description>The latest articles on Forem by kaustubh yerkade (@kaustubhyerkade).</description>
    <link>https://forem.com/kaustubhyerkade</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F398020%2F63dd7dd2-7ee9-491c-be43-a2e117397766.jpg</url>
      <title>Forem: kaustubh yerkade</title>
      <link>https://forem.com/kaustubhyerkade</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/kaustubhyerkade"/>
    <language>en</language>
    <item>
      <title>I Built an AI-Powered Infrastructure Documentation Generator Using GitHub Copilot CLI</title>
      <dc:creator>kaustubh yerkade</dc:creator>
      <pubDate>Sun, 15 Feb 2026 17:27:05 +0000</pubDate>
      <link>https://forem.com/kaustubhyerkade/i-built-an-ai-powered-infrastructure-documentation-generator-using-github-copilot-cli-5cmb</link>
      <guid>https://forem.com/kaustubhyerkade/i-built-an-ai-powered-infrastructure-documentation-generator-using-github-copilot-cli-5cmb</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/github-2026-01-21"&gt;GitHub Copilot CLI Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;As a DevOps engineer, one recurring pain point I’ve experienced is infrastructure documentation.&lt;/p&gt;

&lt;p&gt;We write Terraform.&lt;br&gt;
We deploy Kubernetes.&lt;br&gt;
We build Docker images.&lt;/p&gt;

&lt;p&gt;But documentation? It often gets delayed or skipped.&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;AI-Powered Infra Doc Generator&lt;/strong&gt;, a CLI tool that turns Infrastructure as Code into structured, enterprise-grade documentation using GitHub Copilot CLI.&lt;/p&gt;

&lt;p&gt;This tool:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reads Terraform / YAML / Dockerfiles
&lt;/li&gt;
&lt;li&gt;Sends structured prompts to GitHub Copilot CLI
&lt;/li&gt;
&lt;li&gt;Generates:

&lt;ul&gt;
&lt;li&gt;Architecture Overview
&lt;/li&gt;
&lt;li&gt;Component Breakdown
&lt;/li&gt;
&lt;li&gt;Security Risk Analysis
&lt;/li&gt;
&lt;li&gt;Cost Optimization Suggestions
&lt;/li&gt;
&lt;li&gt;Mermaid Architecture Diagram
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of this happens automatically, in seconds. Instead of manually writing documentation, infrastructure can now explain itself.&lt;/p&gt;
&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;



&lt;p&gt;Run the CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ infra-doc-gen main.tf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tool creates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INFRA_DOC.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Architecture Overview

This Terraform configuration provisions:
- AWS EC2 instance
- Application Load Balancer
- Security Group
- RDS Database

## Security Risks
- Security group allows 0.0.0.0/0 on port 22
- RDS encryption not enabled
- No IAM least-privilege enforcement

## Cost Optimization
- Use smaller instance types for non-production
- Enable autoscaling
- Consider Reserved Instances

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Screenshots:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwlmvq5m7evqxu4kfucny.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwlmvq5m7evqxu4kfucny.png" alt=" " width="800" height="430"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz4igbl0o45uz2b7hifgr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz4igbl0o45uz2b7hifgr.png" alt=" " width="759" height="1029"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv8xopvcx1b7z3e5m4iv9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv8xopvcx1b7z3e5m4iv9.png" alt=" " width="800" height="215"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It also generates a Mermaid diagram:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
User --&amp;gt; ALB
ALB --&amp;gt; EC2
EC2 --&amp;gt; RDS
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub Repo: &lt;a href="https://github.com/kaustubhyerkade/infra_doc_gen" rel="noopener noreferrer"&gt;https://github.com/kaustubhyerkade/infra_doc_gen&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  My Experience with GitHub Copilot CLI
&lt;/h2&gt;

&lt;p&gt;GitHub Copilot CLI completely changed how I approached this project.&lt;/p&gt;

&lt;p&gt;Instead of building API wrappers or integrating SDKs, I could:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Work directly inside the terminal&lt;/li&gt;
&lt;li&gt;Use natural language prompts&lt;/li&gt;
&lt;li&gt;Iterate rapidly on structured AI responses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The biggest learning for me was prompt engineering.&lt;/p&gt;

&lt;p&gt;Instead of asking:&lt;br&gt;
“Explain this Terraform file”&lt;/p&gt;

&lt;p&gt;I used structured instructions like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You are a senior cloud architect.

Analyze the following infrastructure code and generate:

1. Architecture Overview
2. Key Components
3. Security Risks
4. Cost Optimization Suggestions
5. Mermaid diagram
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
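
&lt;p&gt;A minimal sketch of how a prompt like this could be assembled before it is handed to the model. &lt;code&gt;build_prompt&lt;/code&gt; and the &lt;code&gt;--- file ---&lt;/code&gt; separator are hypothetical names chosen for illustration, not part of the tool or of GitHub Copilot CLI, and the step that actually passes the text to Copilot CLI is left out because it depends on the installed version.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# build_prompt is a hypothetical helper name, used here for illustration only.
build_prompt() {
    file=$1
    # Fixed instruction header, copied from the structured prompt above.
    printf '%s\n' \
        "You are a senior cloud architect." \
        "" \
        "Analyze the following infrastructure code and generate:" \
        "" \
        "1. Architecture Overview" \
        "2. Key Components" \
        "3. Security Risks" \
        "4. Cost Optimization Suggestions" \
        "5. Mermaid diagram" \
        "" \
        "--- $file ---"
    # Append the IaC file itself so the model sees the actual resources.
    cat "$file"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Calling &lt;code&gt;build_prompt main.tf&lt;/code&gt; prints the full prompt to standard output, which keeps the prompt text easy to inspect and version separately from the call to the model.&lt;/p&gt;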



&lt;p&gt;This dramatically improved the quality and structure of the output.&lt;br&gt;
It genuinely felt like pairing with a Cloud Architect inside my terminal.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>githubchallenge</category>
      <category>cli</category>
      <category>githubcopilot</category>
    </item>
    <item>
      <title>Linux Internals Everyone *Must* Understand</title>
      <dc:creator>kaustubh yerkade</dc:creator>
      <pubDate>Sun, 18 Jan 2026 19:29:42 +0000</pubDate>
      <link>https://forem.com/kaustubhyerkade/linux-internals-everyone-must-understand-526l</link>
      <guid>https://forem.com/kaustubhyerkade/linux-internals-everyone-must-understand-526l</guid>
      <description>&lt;h1&gt;
  
  
  Linux Internals Every DevOps Engineer &lt;em&gt;Must&lt;/em&gt; Understand
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;(Let's go beyond “I know Linux”)&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you claim &lt;strong&gt;DevOps&lt;/strong&gt;, Linux isn’t just an OS; it’s your &lt;strong&gt;runtime, debugger, firewall, scheduler, and autopsy report&lt;/strong&gt;.&lt;br&gt;
This article covers the &lt;strong&gt;Linux internals you quietly get tested on.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  1. Linux File System Internals: &lt;code&gt;/proc&lt;/code&gt; &amp;amp; &lt;code&gt;/sys&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fttlkcg9uc766v60r5s7c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fttlkcg9uc766v60r5s7c.png" alt="Image" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zvkvbgxqr0j6ckrqaua.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zvkvbgxqr0j6ckrqaua.png" alt="Image" width="558" height="813"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjhis4u9zku9gll84gd3x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjhis4u9zku9gll84gd3x.png" alt="Image" width="479" height="312"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Linux exposes its &lt;strong&gt;own brain&lt;/strong&gt; as files.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;/proc&lt;/code&gt; – Process &amp;amp; Kernel Runtime View
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Virtual filesystem (no disk I/O)&lt;/li&gt;
&lt;li&gt;Created at boot&lt;/li&gt;
&lt;li&gt;Reflects &lt;strong&gt;current kernel state&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Key paths:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/proc/cpuinfo      &lt;span class="c"&gt;# CPU architecture, cores&lt;/span&gt;
/proc/meminfo      &lt;span class="c"&gt;# Memory stats&lt;/span&gt;
/proc/loadavg      &lt;span class="c"&gt;# Load average&lt;/span&gt;
/proc/&amp;lt;PID&amp;gt;/fd     &lt;span class="c"&gt;# Open file descriptors&lt;/span&gt;
/proc/&amp;lt;PID&amp;gt;/maps   &lt;span class="c"&gt;# Memory mapping&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Production insight&lt;/strong&gt;&lt;br&gt;
If a Java app seems to be leaking file descriptors (open files, sockets, pipes):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; /proc/&amp;lt;PID&amp;gt;/fd | &lt;span class="nb"&gt;wc&lt;/span&gt; &lt;span class="nt"&gt;-l&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run it a few times: a count that keeps climbing means file descriptors are leaking.&lt;/p&gt;
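
&lt;p&gt;A single count says little; the trend is what matters. Here is a small sketch that samples the count a few times. &lt;code&gt;count_fds&lt;/code&gt; and &lt;code&gt;watch_fds&lt;/code&gt; are hypothetical helper names, and it assumes a Linux host where you are allowed to read the target's &lt;code&gt;/proc&lt;/code&gt; entry.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# count_fds PID: hypothetical helper, prints the number of open descriptors.
count_fds() {
    # Every entry under /proc/PID/fd is one open descriptor (file, socket, pipe).
    ls "/proc/$1/fd" | wc -l
}

# watch_fds PID SAMPLES INTERVAL: print the count several times to expose a trend.
watch_fds() {
    i=0
    while [ "$i" -lt "$2" ]; do
        echo "$(date +%T) fds=$(count_fds "$1")"
        i=$((i + 1))
        sleep "$3"
    done
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For example, &lt;code&gt;watch_fds 1234 10 5&lt;/code&gt; takes ten samples, five seconds apart, for the placeholder PID 1234. A steady number is normal; a number that only ever grows is the leak.&lt;/p&gt;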




&lt;h3&gt;
  
  
  &lt;code&gt;/sys&lt;/code&gt; – Hardware &amp;amp; Driver Control Layer
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Used by &lt;strong&gt;udev&lt;/strong&gt;, drivers, containers&lt;/li&gt;
&lt;li&gt;Allows &lt;em&gt;controlled kernel interaction&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/sys/class/net/eth0/speed
/sys/block/sda/queue/scheduler
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key takeaway&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/proc&lt;/code&gt; = &lt;em&gt;What is happening now&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/sys&lt;/code&gt; = &lt;em&gt;How hardware &amp;amp; kernel are wired&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  2. Process Lifecycle: fork → exec → zombie
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5zpco8owxki80no3ul3.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5zpco8owxki80no3ul3.webp" alt="Image" width="800" height="471"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6q2ijktai8fv2nz33vgh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6q2ijktai8fv2nz33vgh.png" alt="Image" width="577" height="323"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqsqynxia7vdbuddre7qi.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqsqynxia7vdbuddre7qi.jpeg" alt="Image" width="700" height="359"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Understanding processes separates juniors from seniors.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Lifecycle
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;fork()&lt;/strong&gt; → child process created&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;exec()&lt;/strong&gt; → program replaced&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;wait()&lt;/strong&gt; → parent collects exit status
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;pid_t pid = fork();                           /* 0 in the child, the child's PID in the parent */
if (pid == 0) {
    execl("/bin/java", "java", (char *)NULL); /* child: replace the process image */
    _exit(127);                               /* reached only if execl() failed */
}
int status; waitpid(pid, &amp;amp;status, 0);    /* parent: collect the exit status (reaps the child) */
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Zombies (Not Horror, Just Bad Parenting)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Process finished execution&lt;/li&gt;
&lt;li&gt;Parent &lt;strong&gt;did not collect exit code&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;PID still exists&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Check:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
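&lt;pre class="highlight shell"&gt;&lt;code&gt;ps &lt;span class="nt"&gt;-eo&lt;/span&gt; pid,ppid,stat,comm | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'$3 ~ /^Z/'&lt;/span&gt;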
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fix:&lt;/p&gt;

&lt;ul&gt;
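&lt;li&gt;Restart the parent (its zombies are then adopted and reaped by init)&lt;/li&gt;
&lt;li&gt;Or fix the parent so it handles &lt;code&gt;SIGCHLD&lt;/code&gt; and calls &lt;code&gt;wait()&lt;/code&gt;
&lt;/li&gt;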
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Interview gold line&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Zombies consume almost no memory, but enough of them can exhaust PID space.”&lt;/p&gt;
&lt;/blockquote&gt;
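
&lt;p&gt;To fix a zombie you need its parent, not the zombie itself. The sketch below reads &lt;code&gt;/proc&lt;/code&gt; directly, so it also works on minimal hosts without &lt;code&gt;ps&lt;/code&gt;. &lt;code&gt;list_zombies&lt;/code&gt; is a hypothetical helper name, and the parsing assumes the Linux &lt;code&gt;/proc/PID/stat&lt;/code&gt; layout of &lt;code&gt;pid (comm) state ppid ...&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# list_zombies: hypothetical helper, prints each zombie with its parent PID.
list_zombies() {
    for f in /proc/[0-9]*/stat; do
        [ -r "$f" ] || continue
        # The process may exit between the glob and the read; skip it if so.
        line=$(cat "$f") || continue
        pid=${line%% *}
        # Drop "pid (comm) "; the command name may itself contain spaces.
        rest=${line##*") "}
        state=${rest%% *}
        rest=${rest#* }
        ppid=${rest%% *}
        if [ "$state" = "Z" ]; then
            echo "zombie pid=$pid parent=$ppid"
        fi
    done
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Run &lt;code&gt;list_zombies&lt;/code&gt; and look at the &lt;code&gt;parent=&lt;/code&gt; column: that process is the one that is not calling &lt;code&gt;wait()&lt;/code&gt;.&lt;/p&gt;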




&lt;h2&gt;
  
  
  3. Memory, CPU &amp;amp; Load Average (The Most Misunderstood Topic)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb7z1rvmswdrefu35tmvp.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb7z1rvmswdrefu35tmvp.webp" alt="Image" width="800" height="625"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx6340sqvs8s2w3tvinmz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx6340sqvs8s2w3tvinmz.png" alt="Image" width="800" height="442"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgakwf3pqodwqwyh3nt2a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgakwf3pqodwqwyh3nt2a.png" alt="Image" width="800" height="434"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Load Average ≠ CPU Usage
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;uptime&lt;/span&gt;
&lt;span class="c"&gt;# ... load average: 1.20, 0.90, 0.70&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Average number of processes that are runnable or in uninterruptible (usually disk I/O) sleep, over the last &lt;strong&gt;1, 5, 15 minutes&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Load = 4 on 4 cores&lt;/td&gt;
&lt;td&gt;Healthy (fully used, nothing queued)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Load = 10 on 4 cores&lt;/td&gt;
&lt;td&gt;Overloaded&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;High load, low CPU&lt;/td&gt;
&lt;td&gt;I/O bottleneck (tasks stuck in uninterruptible sleep)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
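
&lt;p&gt;The table compares load against core count, so the useful number is load per core. A minimal sketch, assuming a Linux host with &lt;code&gt;/proc/loadavg&lt;/code&gt; and &lt;code&gt;getconf&lt;/code&gt; available; &lt;code&gt;load_per_core&lt;/code&gt; is a hypothetical helper name.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# load_per_core: hypothetical helper, prints the 1-minute load divided by online CPUs.
load_per_core() {
    # The first field of /proc/loadavg is the 1-minute load average.
    load1=$(cut -d ' ' -f 1 /proc/loadavg)
    # Number of CPUs currently online.
    cores=$(getconf _NPROCESSORS_ONLN)
    # Shell arithmetic is integer-only, so let awk do the division.
    awk -v l="$load1" -v c="$cores" 'BEGIN { printf "%.2f\n", l / c }'
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A result around 1.00 or below matches the “healthy” row; a result well above 1.00 means work is queueing, and the next step is to check whether it is waiting on CPU or on I/O.&lt;/p&gt;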




&lt;h3&gt;
  
  
  Memory: Why “Free” Lies
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;free &lt;span class="nt"&gt;-m&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Focus on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;available&lt;/strong&gt;, not free&lt;/li&gt;
&lt;li&gt;Linux aggressively uses otherwise idle RAM for page cache and gives it back under memory pressure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A myth to clear up:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“High memory usage is bad” ❌&lt;br&gt;
“Unused memory is wasted memory” ✅&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  Real Debug Workflow
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
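&lt;pre class="highlight shell"&gt;&lt;code&gt;vmstat 1            &lt;span class="c"&gt;# run queue (r), blocked tasks (b), CPU I/O wait (wa)&lt;/span&gt;
iostat &lt;span class="nt"&gt;-x&lt;/span&gt; 1         &lt;span class="c"&gt;# per-device utilization and latency&lt;/span&gt;
top                 &lt;span class="c"&gt;# per-process CPU and memory (or htop)&lt;/span&gt;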
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Together, these let you correlate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CPU wait&lt;/li&gt;
&lt;li&gt;Disk latency&lt;/li&gt;
&lt;li&gt;Run queue&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  4. Networking Basics: Ports, Sockets &amp;amp; Reality
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzww3jf1nactit3uyr6j9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzww3jf1nactit3uyr6j9.png" alt="Image" width="549" height="262"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cece5r3ghvg8p510iiu.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cece5r3ghvg8p510iiu.jpg" alt="Image" width="800" height="565"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv1z3qxpocgqooadttiym.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv1z3qxpocgqooadttiym.png" alt="Image" width="800" height="516"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Port ≠ Process
&lt;/h3&gt;

&lt;p&gt;A &lt;strong&gt;socket = IP + Port + Protocol&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ss &lt;span class="nt"&gt;-tulnp&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LISTEN 0 128 0.0.0.0:8080 java
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Connection States You Must Know
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;State&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
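&lt;tr&gt;
&lt;td&gt;LISTEN&lt;/td&gt;
&lt;td&gt;Waiting for incoming connections&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESTABLISHED&lt;/td&gt;
&lt;td&gt;Active connection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TIME_WAIT&lt;/td&gt;
&lt;td&gt;Normal after a close: the side that closed first waits out stray packets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CLOSE_WAIT&lt;/td&gt;
&lt;td&gt;Peer has closed; the local app has not called close() yet (a growing pile is an app bug)&lt;/td&gt;
&lt;/tr&gt;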
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;A warning sign&lt;/strong&gt;&lt;br&gt;
If you see many sockets stuck in &lt;code&gt;CLOSE_WAIT&lt;/code&gt;, the application is leaking connections: the peer hung up, but the app never closed its side.&lt;/p&gt;
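
&lt;p&gt;Counting sockets per state makes that pile-up visible. A minimal sketch: &lt;code&gt;count_states&lt;/code&gt; is a hypothetical helper that tallies the first column of &lt;code&gt;ss -tan&lt;/code&gt; style output read from standard input.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# count_states: hypothetical helper, reads "ss -tan" style output on stdin.
count_states() {
    # Skip the header row, then tally the first column (the TCP state).
    awk 'NR != 1 { n[$1]++ } END { for (s in n) print s, n[s] }'
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Use it as &lt;code&gt;ss -tan | count_states&lt;/code&gt;. Note that &lt;code&gt;ss&lt;/code&gt; spells the states with a hyphen and abbreviates one of them, so look for &lt;code&gt;CLOSE-WAIT&lt;/code&gt;, &lt;code&gt;TIME-WAIT&lt;/code&gt;, and &lt;code&gt;ESTAB&lt;/code&gt; in the result.&lt;/p&gt;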


&lt;h2&gt;
  
  
  5. Permissions &amp;amp; SELinux (Where “It Works on My VM” Dies)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc2abogvieqaijmk5uuzb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc2abogvieqaijmk5uuzb.png" alt="Image" width="660" height="386"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu4nehru2s7du5t8wy2tc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu4nehru2s7du5t8wy2tc.png" alt="Image" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9lfk6v634fpcpmj7e5ea.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9lfk6v634fpcpmj7e5ea.png" alt="Image" width="800" height="567"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Linux Permissions Refresher
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nt"&gt;-rwxr-x---&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;But permissions alone are &lt;strong&gt;not enough&lt;/strong&gt;.&lt;/p&gt;


&lt;h3&gt;
  
  
  SELinux (Mandatory Access Control)
&lt;/h3&gt;

&lt;p&gt;Modes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;getenforce
&lt;span class="c"&gt;# Enforcing | Permissive | Disabled&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why prod apps fail:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Correct permissions&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Wrong SELinux context&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Fix properly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
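&lt;pre class="highlight shell"&gt;&lt;code&gt;ausearch &lt;span class="nt"&gt;-m&lt;/span&gt; avc &lt;span class="nt"&gt;-ts&lt;/span&gt; recent                          &lt;span class="c"&gt;# find the denial&lt;/span&gt;
semanage fcontext &lt;span class="nt"&gt;-a&lt;/span&gt; &lt;span class="nt"&gt;-t&lt;/span&gt; &amp;lt;TYPE&amp;gt; &lt;span class="s1"&gt;'&amp;lt;PATH&amp;gt;(/.*)?'&lt;/span&gt;   &lt;span class="c"&gt;# record the correct context for the path&lt;/span&gt;
restorecon &lt;span class="nt"&gt;-Rv&lt;/span&gt; &amp;lt;PATH&amp;gt;                             &lt;span class="c"&gt;# apply it to the files on disk&lt;/span&gt;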
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Senior rule&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Never disable SELinux in production; fix the policy or the file context instead.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  6. systemd: The Real Init System
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmwxde4u2jyl3eul3blj3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmwxde4u2jyl3eul3blj3.png" alt="Image" width="675" height="380"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzef3ai2jv8gx8e21flql.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzef3ai2jv8gx8e21flql.jpg" alt="Image" width="738" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft0y2zfhkku4n0gms5o65.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft0y2zfhkku4n0gms5o65.png" alt="Image" width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  systemd Is More Than “service start”
&lt;/h3&gt;

&lt;p&gt;It handles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Process supervision&lt;/li&gt;
&lt;li&gt;Logging&lt;/li&gt;
&lt;li&gt;Dependency management&lt;/li&gt;
&lt;li&gt;Auto-restart&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example unit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Service]&lt;/span&gt;
&lt;span class="py"&gt;ExecStart&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/app/start.sh&lt;/span&gt;
&lt;span class="py"&gt;Restart&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;always&lt;/span&gt;
&lt;span class="py"&gt;MemoryMax&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;2G&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check failures:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; myapp &lt;span class="nt"&gt;--since&lt;/span&gt; today
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Why DevOps Engineers Love systemd
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Built-in watchdog&lt;/li&gt;
&lt;li&gt;CGroup resource limits&lt;/li&gt;
&lt;li&gt;Deterministic startup&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  How Interviewers Evaluate This Knowledge
&lt;/h2&gt;

&lt;p&gt;They won’t ask:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Explain /proc”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;They’ll ask:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Why is load high but CPU idle?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Or:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“App restarted but port still busy”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you understand internals, answers come naturally.&lt;/p&gt;




&lt;p&gt;Linux is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your &lt;strong&gt;observability platform&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Your &lt;strong&gt;runtime security layer&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Your &lt;strong&gt;truth source&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Tools change.&lt;br&gt;
Containers evolve.&lt;br&gt;
&lt;strong&gt;Linux fundamentals compound forever.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>linux</category>
      <category>devops</category>
    </item>
    <item>
      <title>The "Aha!" Moment: Why Understanding the JVM Changed How I Write Java</title>
      <dc:creator>kaustubh yerkade</dc:creator>
      <pubDate>Thu, 01 Jan 2026 10:27:07 +0000</pubDate>
      <link>https://forem.com/kaustubhyerkade/the-aha-moment-why-understanding-the-jvm-changed-how-i-write-java-4hef</link>
      <guid>https://forem.com/kaustubhyerkade/the-aha-moment-why-understanding-the-jvm-changed-how-i-write-java-4hef</guid>
      <description>&lt;h1&gt;
  
  
  I Used Java for Years. Then I Finally Understood the JVM
&lt;/h1&gt;

&lt;p&gt;I’ve scaled Java services, increased heap sizes, and blamed GC many times.&lt;/p&gt;

&lt;p&gt;And for a long time, I still didn’t understand what the JVM was &lt;em&gt;really&lt;/em&gt; doing.&lt;/p&gt;

&lt;p&gt;If you’ve ever:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Given a service &lt;strong&gt;more RAM&lt;/strong&gt; and watched latency get worse&lt;/li&gt;
&lt;li&gt;Seen &lt;strong&gt;CPU spike&lt;/strong&gt; while traffic was low&lt;/li&gt;
&lt;li&gt;Hit &lt;code&gt;OutOfMemoryError&lt;/code&gt; on a machine with free memory&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This post is for you.&lt;/p&gt;

&lt;p&gt;This isn’t a JVM reference guide. This is something I wish I had before debugging JVM issues in production.&lt;/p&gt;




&lt;h2&gt;
  
  
  The JVM Is Not “Java Runtime”
&lt;/h2&gt;

&lt;h3&gt;
  
  
  It’s a Whole Runtime Operating System
&lt;/h3&gt;

&lt;p&gt;Most people think of the JVM as:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“The thing that runs Java code.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I think that’s dangerously incomplete.&lt;/p&gt;

&lt;p&gt;The JVM is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;memory manager&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;thread scheduler&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;Just-In-Time (JIT) compiler&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;runtime optimizer&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;portable execution platform&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice, the JVM behaves more like a &lt;strong&gt;mini operating system inside your OS&lt;/strong&gt;. (That’s why it’s called the Java Virtual Machine.)&lt;/p&gt;

&lt;p&gt;Once you see it that way, a lot of “mysterious” behavior suddenly makes sense.&lt;/p&gt;




&lt;h2&gt;
  
  
  From Code to CPU: What Actually Happens
&lt;/h2&gt;

&lt;p&gt;Here’s the real journey of your Java code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.java → bytecode (.class) → interpreted → JIT compiled → machine code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key insight:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Your Java code does &lt;strong&gt;not&lt;/strong&gt; run the same way for its entire lifetime.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;At startup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Code is interpreted&lt;/li&gt;
&lt;li&gt;It’s slow but flexible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Over time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hot code paths are detected&lt;/li&gt;
&lt;li&gt;JIT compiles them into optimized native code&lt;/li&gt;
&lt;li&gt;Optimizations are speculative and reversible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s why:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;First requests are slow&lt;/li&gt;
&lt;li&gt;Benchmarks lie&lt;/li&gt;
&lt;li&gt;Restarting a JVM “fixes” performance (temporarily)&lt;/li&gt;
&lt;/ul&gt;
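
&lt;p&gt;To see the warm-up effect yourself, here is a minimal sketch. The class name &lt;code&gt;WarmupDemo&lt;/code&gt; and the loop sizes are my own choices for illustration, and the timings will differ on every machine and JVM. On a typical HotSpot JVM the first rounds tend to be slower than the later ones, but treat the numbers as a rough signal, not as a proper benchmark.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;public class WarmupDemo {
    // Small, hot method: a likely JIT candidate once it has been called often enough.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i &amp;lt;= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        long sink = 0; // accumulate results so the work is not discarded as dead code
        for (int round = 1; round &amp;lt;= 10; round++) {
            long start = System.nanoTime();
            for (int call = 0; call &amp;lt; 20_000; call++) {
                sink += sumOfSquares(1_000);
            }
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("round " + round + ": " + micros + " us");
        }
        System.out.println("checksum: " + sink);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The same code runs in every round. Only the way the JVM executes it changes.&lt;/p&gt;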




&lt;h2&gt;
  
  
  ClassLoaders: The Hidden Source of Insanity
&lt;/h2&gt;

&lt;p&gt;ClassLoaders aren’t just about loading classes.&lt;br&gt;
They define &lt;strong&gt;identity&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Two classes with the same name:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Loaded by different ClassLoaders&lt;/li&gt;
&lt;li&gt;Are &lt;strong&gt;not the same class&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This explains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ClassCastException&lt;/code&gt; that makes no sense&lt;/li&gt;
&lt;li&gt;Plugin architecture bugs&lt;/li&gt;
&lt;li&gt;Dependency conflicts in fat JARs&lt;/li&gt;
&lt;li&gt;Spring Boot classpath nightmares&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So basically:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;ClassLoaders are namespaces, not folders.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
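
&lt;p&gt;Here is a small sketch of that identity rule. &lt;code&gt;IsolatingLoader&lt;/code&gt; and &lt;code&gt;Payload&lt;/code&gt; are hypothetical names made up for this demo. The loader has no application parent, so it defines the class from the compiled bytes itself. The sketch assumes Java 9+ (for &lt;code&gt;readAllBytes&lt;/code&gt;) and that the compiled class file is reachable through the application class loader.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;import java.io.IOException;
import java.io.InputStream;

public class LoaderIdentityDemo {
    /** Trivial class that gets loaded more than once. */
    public static class Payload {}

    /** Hypothetical loader: no application parent, so it defines Payload itself. */
    static class IsolatingLoader extends ClassLoader {
        IsolatingLoader() {
            super(null); // only the bootstrap loader is consulted before findClass
        }

        @Override
        protected Class&amp;lt;?&amp;gt; findClass(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            ClassLoader appLoader = LoaderIdentityDemo.class.getClassLoader();
            try (InputStream in = appLoader.getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] bytes = in.readAllBytes();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String name = Payload.class.getName();
        Class&amp;lt;?&amp;gt; first = new IsolatingLoader().loadClass(name);
        Class&amp;lt;?&amp;gt; second = new IsolatingLoader().loadClass(name);

        System.out.println("same name:  " + first.getName().equals(second.getName()));
        System.out.println("same class: " + (first == second));
        System.out.println("same as app-loaded class: " + (first == Payload.class));
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The names match, but the &lt;code&gt;Class&lt;/code&gt; objects do not. Casting an instance of one to the other is exactly the kind of &lt;code&gt;ClassCastException&lt;/code&gt; that “makes no sense”.&lt;/p&gt;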

&lt;p&gt;Once a class is loaded, unloading it is complicated.&lt;br&gt;
And sometimes impossible.&lt;/p&gt;




&lt;h2&gt;
  
  
  JVM Memory: Why “Just Increase Heap” Is Bad Advice
&lt;/h2&gt;

&lt;p&gt;This is where most JVM myths live.&lt;/p&gt;

&lt;h3&gt;
  
  
  JVM Memory ≠ Heap
&lt;/h3&gt;

&lt;p&gt;The JVM uses &lt;strong&gt;more memory than you think&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Heap (Young + Old)&lt;/li&gt;
&lt;li&gt;Metaspace (class metadata)&lt;/li&gt;
&lt;li&gt;Thread stacks&lt;/li&gt;
&lt;li&gt;Native memory&lt;/li&gt;
&lt;li&gt;Direct buffers&lt;/li&gt;
&lt;li&gt;Code cache&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Important truth:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;-Xmx&lt;/code&gt; limits the heap, &lt;strong&gt;not the JVM&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is why:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Containers OOM-kill Java apps&lt;/li&gt;
&lt;li&gt;Kubernetes limits feel “ignored”&lt;/li&gt;
&lt;li&gt;Metaspace OOMs happen unexpectedly&lt;/li&gt;
&lt;/ul&gt;
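
&lt;p&gt;You can get a rough view of this from inside a running JVM with the standard &lt;code&gt;java.lang.management&lt;/code&gt; API. The sketch below (the class name &lt;code&gt;MemoryAreasDemo&lt;/code&gt; is just for illustration) prints heap and non-heap usage plus the individual memory pools. Pool names depend on the JVM and the collector in use, and thread stacks and other native allocations are not covered by these beans, so the real process footprint is larger than what you see here.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class MemoryAreasDemo {
    static long toMiB(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        MemoryUsage nonHeap = ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage();

        // getMax() is the heap ceiling (what -Xmx controls); it can be -1 if undefined
        System.out.println("heap used:     " + toMiB(heap.getUsed()) + " MiB, max bytes: " + heap.getMax());
        System.out.println("non-heap used: " + toMiB(nonHeap.getUsed()) + " MiB");

        // Individual pools: heap generations/regions, Metaspace, code cache segments, ...
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            System.out.println(pool.getType() + " | " + pool.getName()
                    + " | " + usage.getUsed() / 1024 + " KiB");
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Even in a tiny program the non-heap number is not zero, and none of it is limited by &lt;code&gt;-Xmx&lt;/code&gt;.&lt;/p&gt;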

&lt;h3&gt;
  
  
  Why Bigger Heap Can Make Things Worse
&lt;/h3&gt;

&lt;p&gt;A larger heap:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Means &lt;strong&gt;longer GC cycles&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Increases pause time risk&lt;/li&gt;
&lt;li&gt;Can hide memory leaks until it’s too late&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In production:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A stable, smaller heap often performs better than a massive one.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Garbage Collection: Not About Speed but About Predictability
&lt;/h2&gt;

&lt;p&gt;Garbage Collection exists because:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Manual memory management does not scale for humans.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But GC is not free.&lt;/p&gt;

&lt;p&gt;Every GC strategy is a tradeoff between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Throughput&lt;/li&gt;
&lt;li&gt;Latency&lt;/li&gt;
&lt;li&gt;Memory footprint&lt;/li&gt;
&lt;li&gt;Predictability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Modern collectors (G1, ZGC, Shenandoah) optimize for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Shorter pauses&lt;/li&gt;
&lt;li&gt;More consistent latency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Key realization:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;GC tuning is about &lt;strong&gt;controlling pauses&lt;/strong&gt;, not eliminating them.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And yes, &lt;strong&gt;Stop-The-World (STW) pauses still exist&lt;/strong&gt;.&lt;br&gt;
They’re just shorter and smarter now.&lt;/p&gt;




&lt;h2&gt;
  
  
  JIT Compilation: Your Code Changes While Running
&lt;/h2&gt;

&lt;p&gt;This part blows minds.&lt;/p&gt;

&lt;p&gt;The JVM:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Profiles your code&lt;/li&gt;
&lt;li&gt;Assumes patterns&lt;/li&gt;
&lt;li&gt;Optimizes aggressively&lt;/li&gt;
&lt;li&gt;Deoptimizes when assumptions break&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This explains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Why traffic shape matters&lt;/li&gt;
&lt;li&gt;Why long-running services behave differently&lt;/li&gt;
&lt;li&gt;Why redeploying can reset performance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The JVM is constantly asking:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Is this still the fastest way to run this code?”&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Threads, CPU, and the Illusion of Parallelism
&lt;/h2&gt;

&lt;p&gt;Java threads map to &lt;strong&gt;OS threads&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Which means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Context switching is expensive&lt;/li&gt;
&lt;li&gt;More threads ≠ more throughput&lt;/li&gt;
&lt;li&gt;Blocking IO kills scalability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is why:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CPU spikes on “idle” systems&lt;/li&gt;
&lt;li&gt;Thread dumps reveal chaos&lt;/li&gt;
&lt;li&gt;Virtual threads exist (finally)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So behind the scenes:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Threads compete. They don’t cooperate.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Why JVM Apps Sometimes Fail in Production
&lt;/h2&gt;

&lt;p&gt;Most JVM outages are not caused by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Java being slow&lt;/li&gt;
&lt;li&gt;The GC being broken&lt;/li&gt;
&lt;li&gt;The JVM being bad&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They’re caused by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wrong thinking&lt;/li&gt;
&lt;li&gt;Blind tuning&lt;/li&gt;
&lt;li&gt;Ignoring runtime behavior&lt;/li&gt;
&lt;li&gt;Treating the JVM like a black box&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Common mistakes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Increasing heap without GC analysis&lt;/li&gt;
&lt;li&gt;Ignoring native memory&lt;/li&gt;
&lt;li&gt;No GC logs&lt;/li&gt;
&lt;li&gt;No understanding of startup vs steady state&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The JVM Debugging Toolkit (Minimal, Powerful)
&lt;/h2&gt;

&lt;p&gt;You don’t need fancy tools to understand the JVM.&lt;br&gt;
Learning these is enough:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;jcmd&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;jstack&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;jmap&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;GC logs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most production mysteries can be explained with these tools alone.&lt;/p&gt;




&lt;h2&gt;
  
  
  The JVM (If You Want to Remember Only One Thing)
&lt;/h2&gt;

&lt;p&gt;Remember this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The JVM is a &lt;strong&gt;self-optimizing runtime&lt;/strong&gt; that trades predictability for performance.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Once you accept that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Java feels less random&lt;/li&gt;
&lt;li&gt;Performance issues feel explainable&lt;/li&gt;
&lt;li&gt;Production debugging gets calmer&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Java doesn’t run &lt;em&gt;on&lt;/em&gt; your machine.&lt;/p&gt;

&lt;p&gt;It negotiates with it.&lt;br&gt;
Continuously.&lt;br&gt;
Aggressively.&lt;/p&gt;

&lt;p&gt;And once you understand the JVM, Java stops feeling slow&lt;br&gt;
and starts feeling &lt;strong&gt;honest&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  A 5-Minute GC Log Walkthrough (That Actually Helps)
&lt;/h2&gt;

&lt;p&gt;GC logs look scary until you know &lt;strong&gt;what to look for&lt;/strong&gt;.&lt;br&gt;
Let’s decode the &lt;em&gt;only things that matter&lt;/em&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Enable GC Logging (Modern JVM)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-Xlog:gc*:stdout:time,level,tags
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When GC happened&lt;/li&gt;
&lt;li&gt;Which collector ran&lt;/li&gt;
&lt;li&gt;How long the pause was&lt;/li&gt;
&lt;li&gt;How much memory was reclaimed&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  2. A Real GC Log Line (Simplified)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[3.421s][info][gc] GC(12) Pause Young (Normal) 512M-&amp;gt;128M(2048M) 45.6ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. How to Read It (Left → Right)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;3.421s&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
→ Time since JVM start&lt;br&gt;
If this is early, you’re still warming up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;GC(12)&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
→ GC ID 12 (IDs start at 0, so this is the 13th collection)&lt;br&gt;
High frequency = allocation pressure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;Pause Young&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
→ Minor GC&lt;br&gt;
Usually healthy. Frequent is okay &lt;em&gt;until latency matters&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;512M-&amp;gt;128M(2048M)&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
→ Heap used before → after the collection (total heap size in parentheses)&lt;br&gt;
If “after” keeps growing, you’re promoting objects too fast.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;45.6ms&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
→ Stop-the-world pause&lt;br&gt;
This is the number users feel.&lt;/p&gt;
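
&lt;p&gt;If you want to pull these fields out of a log programmatically, a small parser is enough to get started. This is a sketch only: &lt;code&gt;GcLineParser&lt;/code&gt; and &lt;code&gt;GcEvent&lt;/code&gt; are hypothetical names, and the regular expression assumes the simplified line format shown above, with sizes reported in &lt;code&gt;M&lt;/code&gt;. Real logs can use other units and extra decorations, so expect to adjust it.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLineParser {
    // Groups: uptime, GC id, pause kind, used before, used after, heap size, pause
    private static final Pattern LINE = Pattern.compile(
            "\\[(\\d+\\.\\d+)s\\].*GC\\((\\d+)\\) (.+?) (\\d+)M-&amp;gt;(\\d+)M\\((\\d+)M\\) (\\d+\\.\\d+)ms");

    /** Plain value holder for one parsed pause line. */
    static final class GcEvent {
        final double uptimeSec;
        final int id;
        final String kind;
        final int beforeMb;
        final int afterMb;
        final int heapMb;
        final double pauseMs;

        GcEvent(double uptimeSec, int id, String kind,
                int beforeMb, int afterMb, int heapMb, double pauseMs) {
            this.uptimeSec = uptimeSec;
            this.id = id;
            this.kind = kind;
            this.beforeMb = beforeMb;
            this.afterMb = afterMb;
            this.heapMb = heapMb;
            this.pauseMs = pauseMs;
        }

        int reclaimedMb() {
            return beforeMb - afterMb;
        }
    }

    static GcEvent parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("unrecognized GC line: " + line);
        }
        return new GcEvent(
                Double.parseDouble(m.group(1)), Integer.parseInt(m.group(2)), m.group(3),
                Integer.parseInt(m.group(4)), Integer.parseInt(m.group(5)),
                Integer.parseInt(m.group(6)), Double.parseDouble(m.group(7)));
    }

    public static void main(String[] args) {
        GcEvent e = parse("[3.421s][info][gc] GC(12) Pause Young (Normal) 512M-&amp;gt;128M(2048M) 45.6ms");
        System.out.println(e.kind + ": reclaimed " + e.reclaimedMb() + "M in " + e.pauseMs + "ms");
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Once each line is a &lt;code&gt;GcEvent&lt;/code&gt;, tracking the “after” value and the pause time across many events is a simple loop.&lt;/p&gt;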

&lt;h3&gt;
  
  
  4. What Healthy GC Looks Like
&lt;/h3&gt;

&lt;p&gt;✔ Mostly Young GCs&lt;br&gt;
✔ Short pauses (&amp;lt;50ms)&lt;br&gt;
✔ Old Gen stays mostly flat&lt;br&gt;
✔ No Full GC during normal load&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Red Flags to Watch For
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pause &amp;gt; 200ms&lt;/strong&gt; → Latency spikes&lt;br&gt;
&lt;strong&gt;Old Gen always growing&lt;/strong&gt; → Memory leak or bad object lifetime&lt;br&gt;
&lt;strong&gt;Frequent Full GC&lt;/strong&gt; → Heap too small or wrong GC&lt;br&gt;
&lt;strong&gt;GC during low traffic&lt;/strong&gt; → Memory fragmentation&lt;/p&gt;

&lt;h3&gt;
  
  
  6. One Golden Rule
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;GC logs tell a story. Don’t read them line by line; look for patterns.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Most JVM issues are visible &lt;strong&gt;10 lines into the log&lt;/strong&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  7. When GC Logs Save Production
&lt;/h3&gt;

&lt;p&gt;If you ever think:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“CPU is high but traffic is low”&lt;/li&gt;
&lt;li&gt;“Latency spikes every few minutes”&lt;/li&gt;
&lt;li&gt;“Heap increase didn’t help”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;GC logs will usually explain &lt;strong&gt;why&lt;/strong&gt;.&lt;/p&gt;




&lt;p&gt;If you liked this post, let me know in the comments.&lt;/p&gt;

</description>
      <category>jvm</category>
      <category>java</category>
      <category>springboot</category>
      <category>spring</category>
    </item>
    <item>
      <title>Building a Global Scale App in 2026: Treating Zero Latency as a Lie (Here’s What Actually Works)</title>
      <dc:creator>kaustubh yerkade</dc:creator>
      <pubDate>Thu, 25 Dec 2025 20:44:25 +0000</pubDate>
      <link>https://forem.com/kaustubhyerkade/building-a-global-app-in-2026-zero-latency-is-a-lie-heres-what-actually-works-1m5m</link>
      <guid>https://forem.com/kaustubhyerkade/building-a-global-app-in-2026-zero-latency-is-a-lie-heres-what-actually-works-1m5m</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Zero latency is just latency we haven’t measured yet.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Every few months, someone asks:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;How do I build an app that works worldwide with zero latency and minimum cost?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Short answer: &lt;strong&gt;You can’t.&lt;/strong&gt;&lt;br&gt;
Long answer: &lt;strong&gt;You don’t need to.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In 2026, the best global applications don’t beat physics;&lt;br&gt;
they &lt;strong&gt;outsmart perception, architecture, and failure modes&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This article explains &lt;strong&gt;how teams actually build “instant” global apps&lt;/strong&gt; without burning money or their on-call engineers.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Hard Truth (Let’s Get This Out of the Way)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The speed of light exists&lt;/li&gt;
&lt;li&gt;Packets still travel through oceans&lt;/li&gt;
&lt;li&gt;Users live far away from your servers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If someone promises &lt;em&gt;zero latency worldwide&lt;/em&gt;, they are selling slides, not systems.&lt;/p&gt;

&lt;p&gt;But here’s the twist:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Users don’t care about latency.&lt;br&gt;
They care about responsiveness.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That’s the game in 2026.&lt;/p&gt;




&lt;h2&gt;
  
  
  The 2026 Model
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Move code to users, not users to code.&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Cache everything you legally can.&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Fail locally, not globally.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you remember only this, you’ll already outperform most systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  High-Level Architecture (2026 Edition)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Three layers. No more.
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Edge (closest to users)&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Regional&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Core (rarely touched)&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If a request reaches your core backend…&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Congratulations, you just paid extra for latency.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  1. Frontend: Static Is the New Fast
&lt;/h2&gt;

&lt;p&gt;In 2026, the fastest frontend is still the simplest one.&lt;/p&gt;

&lt;h3&gt;
  
  
  What winning teams do:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Static-first rendering&lt;/li&gt;
&lt;li&gt;Aggressive CDN usage&lt;/li&gt;
&lt;li&gt;Minimal JavaScript on first load&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why?
&lt;/h3&gt;

&lt;p&gt;Static content served from the edge is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cheap&lt;/li&gt;
&lt;li&gt;Predictable&lt;/li&gt;
&lt;li&gt;Globally fast&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“The fastest query is the one you never run.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  2. Edge Compute: The Real Backend
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Edge compute: because the backend was too far away.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Edge compute isn’t a toy anymore.&lt;br&gt;
It’s where &lt;strong&gt;real logic lives&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  What belongs at the edge:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Authentication checks&lt;/li&gt;
&lt;li&gt;Feature flags&lt;/li&gt;
&lt;li&gt;A/B testing&lt;/li&gt;
&lt;li&gt;Request routing&lt;/li&gt;
&lt;li&gt;Lightweight APIs&lt;/li&gt;
&lt;li&gt;Response shaping&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What doesn’t:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Heavy transactions&lt;/li&gt;
&lt;li&gt;Long-running jobs&lt;/li&gt;
&lt;li&gt;Strongly consistent workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“If your edge function needs a database, you missed the point.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Edge code must be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stateless&lt;/li&gt;
&lt;li&gt;Fast&lt;/li&gt;
&lt;li&gt;Disposable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Stateless code travels better.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Data Strategy: Latency Lives Here
&lt;/h2&gt;

&lt;p&gt;This is where most global apps fail.&lt;/p&gt;

&lt;h3&gt;
  
  
  Common mistake
&lt;/h3&gt;

&lt;p&gt;One global database. One source of truth. One giant bottleneck.&lt;/p&gt;

&lt;h3&gt;
  
  
  The cool 2026 approach: &lt;strong&gt;Data by behavior&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Data Type&lt;/th&gt;
&lt;th&gt;Strategy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Sessions&lt;/td&gt;
&lt;td&gt;Edge KV&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User profiles&lt;/td&gt;
&lt;td&gt;Geo-replicated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Feeds&lt;/td&gt;
&lt;td&gt;Precomputed + cached&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Analytics&lt;/td&gt;
&lt;td&gt;Async events&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Payments&lt;/td&gt;
&lt;td&gt;Regional strong consistency&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Strong consistency is great until customers want speed.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Not all data deserves the same guarantees.&lt;br&gt;
Treat it accordingly.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. APIs: Fewer, Bigger, Lazier
&lt;/h2&gt;

&lt;p&gt;2026 APIs are boring and that’s a compliment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Modern API rules:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Fewer round trips&lt;/li&gt;
&lt;li&gt;Larger responses&lt;/li&gt;
&lt;li&gt;Cacheable by default&lt;/li&gt;
&lt;li&gt;Idempotent writes&lt;/li&gt;
&lt;li&gt;Async side effects&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Every API call is a tax on latency.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If your frontend makes 12 calls to render a page,&lt;br&gt;
the problem isn’t the network.&lt;/p&gt;
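
&lt;p&gt;“Idempotent writes” from the list above is the rule that makes retries safe. One common way to get there is an idempotency key sent by the client. The sketch below is an assumption-heavy illustration: &lt;code&gt;IdempotentWrites&lt;/code&gt;, &lt;code&gt;createOrder&lt;/code&gt;, and the key format are hypothetical, and an in-memory map stands in for what would have to be a shared, durable store in a real system.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class IdempotentWrites {
    // Result of each write, remembered under the client-supplied key
    private final Map&amp;lt;String, String&amp;gt; resultsByKey = new ConcurrentHashMap&amp;lt;&amp;gt;();
    private final AtomicInteger executions = new AtomicInteger();

    /** Runs the write at most once per idempotency key; retries get the stored result. */
    String createOrder(String idempotencyKey, String item) {
        return resultsByKey.computeIfAbsent(idempotencyKey, key -&amp;gt; {
            int orderNumber = executions.incrementAndGet(); // stands in for the real side effect
            return "order-" + orderNumber + ":" + item;
        });
    }

    int executionCount() {
        return executions.get();
    }

    public static void main(String[] args) {
        IdempotentWrites api = new IdempotentWrites();
        String first = api.createOrder("req-42", "book");
        String retry = api.createOrder("req-42", "book"); // e.g. the client timed out and retried

        System.out.println("same result on retry: " + first.equals(retry));
        System.out.println("write executed " + api.executionCount() + " time(s)");
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The retry returns the stored result instead of creating a second order, so a flaky network costs you a round trip, not a duplicate.&lt;/p&gt;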




&lt;h2&gt;
  
  
  5. Caching: Your #1 Performance Tool
&lt;/h2&gt;

&lt;p&gt;Caching hierarchy that actually works:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Browser cache&lt;/li&gt;
&lt;li&gt;Edge cache&lt;/li&gt;
&lt;li&gt;Regional cache&lt;/li&gt;
&lt;li&gt;Core backend (last resort)&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Caching is the only performance optimization that actually works.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Yes, cache invalidation is hard.&lt;br&gt;
No, that doesn’t make skipping caching acceptable.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Designing for Failure (Lessons from 2025)
&lt;/h2&gt;

&lt;p&gt;Recent large outages taught us painful lessons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Central auth can kill everything&lt;/li&gt;
&lt;li&gt;Global rollouts can cascade instantly&lt;/li&gt;
&lt;li&gt;One bad deploy ≠ one bad region&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2026 design rules:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Login should &lt;strong&gt;degrade&lt;/strong&gt;, not block&lt;/li&gt;
&lt;li&gt;Feature flags must &lt;strong&gt;fail open&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Rollouts must be &lt;strong&gt;regional&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Blast radius must be &lt;strong&gt;small&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Everything is highly available until the deploy.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Failure isn’t optional.&lt;br&gt;
&lt;strong&gt;Graceful failure is.&lt;/strong&gt;&lt;/p&gt;
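
&lt;p&gt;For feature flags, one reading of “fail open” is: if the flag service is unreachable, keep serving the request with a known default instead of failing it. A minimal sketch, where &lt;code&gt;FailSafeFlag&lt;/code&gt; and &lt;code&gt;isEnabled&lt;/code&gt; are hypothetical names and the lookup is simulated with a lambda in place of a real flag client:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;import java.util.function.BooleanSupplier;

public class FailSafeFlag {
    /** Evaluates a flag lookup and falls back to a default if the lookup fails. */
    static boolean isEnabled(BooleanSupplier remoteLookup, boolean defaultValue) {
        try {
            return remoteLookup.getAsBoolean();
        } catch (RuntimeException e) {
            // Flag backend is down: keep serving with the default instead of failing the request
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        BooleanSupplier healthy = () -&amp;gt; true;
        BooleanSupplier broken = () -&amp;gt; {
            throw new IllegalStateException("flag service unreachable");
        };

        System.out.println("healthy lookup: " + isEnabled(healthy, false));
        System.out.println("broken lookup:  " + isEnabled(broken, false));
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Which default is “safe” depends on the feature, so it is worth choosing it per flag and not globally.&lt;/p&gt;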




&lt;h2&gt;
  
  
  7. Cost Optimization (Because Finance Is Watching)
&lt;/h2&gt;

&lt;p&gt;How fast global apps stay cheap:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Serverless over always-on&lt;/li&gt;
&lt;li&gt;Edge over regional compute&lt;/li&gt;
&lt;li&gt;Async over synchronous&lt;/li&gt;
&lt;li&gt;Cache over compute&lt;/li&gt;
&lt;li&gt;Events over polling&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“Cloud bills are just performance bugs with receipts.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Edge requests are often &lt;strong&gt;10-100× cheaper&lt;/strong&gt; than traditional backend calls.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Realistic 2026 Stack (Example)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Choice&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Frontend&lt;/td&gt;
&lt;td&gt;Static + partial hydration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CDN&lt;/td&gt;
&lt;td&gt;Global edge network&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compute&lt;/td&gt;
&lt;td&gt;Edge functions + serverless&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data&lt;/td&gt;
&lt;td&gt;Geo-replicated DB + edge KV&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Messaging&lt;/td&gt;
&lt;td&gt;Event streaming&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth&lt;/td&gt;
&lt;td&gt;Stateless tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Observability&lt;/td&gt;
&lt;td&gt;Distributed tracing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Notice what’s missing?&lt;br&gt;
No giant always-on clusters&lt;br&gt;
No global locks&lt;br&gt;
No central chokepoints&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;You can’t beat physics, but you can beat perception&lt;/li&gt;
&lt;li&gt;Simpler systems scale better&lt;/li&gt;
&lt;li&gt;Global ≠ centralized&lt;/li&gt;
&lt;li&gt;Users forgive slowness but not inconsistency&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“If one region fails and no one notices, you designed it right.”&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Closing Thought
&lt;/h2&gt;

&lt;p&gt;In 2026, the best applications won’t be impressive because they’re complex.&lt;/p&gt;

&lt;p&gt;They’ll be impressive because &lt;strong&gt;most of the system quietly stays out of the way&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That’s real engineering.&lt;/p&gt;




</description>
      <category>devops</category>
      <category>cicd</category>
      <category>cloud</category>
      <category>aws</category>
    </item>
    <item>
      <title>Oracle AI Database 26ai in Production</title>
      <dc:creator>kaustubh yerkade</dc:creator>
      <pubDate>Tue, 23 Dec 2025 04:12:50 +0000</pubDate>
      <link>https://forem.com/kaustubhyerkade/oracle-ai-database-26ai-in-production-3f15</link>
      <guid>https://forem.com/kaustubhyerkade/oracle-ai-database-26ai-in-production-3f15</guid>
      <description>&lt;p&gt;&lt;strong&gt;Oracle AI Database 26ai&lt;/strong&gt; is released. It is a next-generation AI-native database release from Oracle that brings artificial intelligence into the core of the database engine.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.oracle.com/in/database/free/get-started/" rel="noopener noreferrer"&gt;https://www.oracle.com/in/database/free/get-started/&lt;/a&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  Oracle AI Database 26ai: Redefining the Database for the AI Era
&lt;/h1&gt;

&lt;p&gt;Databases have traditionally been designed to store data and answer structured queries. Artificial intelligence systems, on the other hand, evolved as separate platforms that required data to be copied, transformed, and synchronized outside the database.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Oracle AI Database 26ai changes this model.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of moving data to AI systems, it brings AI capabilities directly into the database engine. Vector search, semantic similarity, and AI-ready retrieval now operate alongside SQL, transactions, and security controls. This allows engineers to build AI-powered applications without introducing new data pipelines or external vector databases.&lt;/p&gt;

&lt;p&gt;This article explains how Oracle AI Database 26ai works from a technical perspective and how it can be used to build real production-grade AI systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is Oracle AI Database 26ai?
&lt;/h2&gt;

&lt;p&gt;Oracle AI Database 26ai is Oracle’s newest long-term support release of its flagship database, designed for enterprises that need AI-enabled data platforms. It replaces the earlier Oracle Database 23ai release, and all features from 23ai are included in 26ai.&lt;/p&gt;

&lt;p&gt;The core idea is to &lt;strong&gt;architect AI into the foundation of data management&lt;/strong&gt;, enabling intelligent processing, analytics, and application support without exporting data to external systems.&lt;/p&gt;

&lt;p&gt;Key highlights:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI built into the data engine&lt;/li&gt;
&lt;li&gt;Native vector support and similarity search&lt;/li&gt;
&lt;li&gt;Unified support for relational, document, JSON, graph, spatial, and other data types&lt;/li&gt;
&lt;li&gt;Developer productivity features like AI-assisted code generation&lt;/li&gt;
&lt;li&gt;Enterprise-grade performance, governance, and security&lt;/li&gt;
&lt;li&gt;Distributed and multicloud deployment options&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why AI in the Database?
&lt;/h2&gt;

&lt;p&gt;Traditional AI architectures often rely on multi-tier systems where data is moved from the database to feature stores, vector layers, or external services. This movement introduces complexity, latency, and security challenges.&lt;/p&gt;

&lt;p&gt;Oracle’s approach is different:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The database itself becomes a unified platform for data storage, vector computation, intelligence, and AI-driven workflows.&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;This design delivers benefits in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Performance:&lt;/strong&gt; Analytics and AI run where the data resides, reducing round-trip costs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security:&lt;/strong&gt; Data governance and access control are enforced at the database level.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simplicity:&lt;/strong&gt; Fewer systems and pipelines to manage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability:&lt;/strong&gt; Enterprise workloads can scale with distributed database features.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Key Technical Innovations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Native Vector Search and Multimodal Support
&lt;/h3&gt;

&lt;p&gt;Oracle AI Database 26ai includes built-in support for vector data types, enabling semantic and similarity searches directly within SQL queries. This means you can index and search vectors generated from text, images, or other media without external systems. (&lt;a href="https://www.oracle.com/in/database/26ai/" rel="noopener noreferrer"&gt;Oracle&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;Multimodal data becomes first-class: structured tables, JSON documents, graphs, and spatial data can all participate in vector-enabled workflows. (&lt;a href="https://docs.oracle.com/en/database/oracle/oracle-database/" rel="noopener noreferrer"&gt;Oracle Docs&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;This enables use cases like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Semantic search across enterprise knowledge bases&lt;/li&gt;
&lt;li&gt;Hybrid relational + vector queries&lt;/li&gt;
&lt;li&gt;Business logic combined with nearest-neighbor search&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Agentic AI Workflows
&lt;/h3&gt;

&lt;p&gt;Oracle AI Database 26ai introduces support for &lt;strong&gt;in-database AI agents&lt;/strong&gt;, frameworks that can execute autonomous or semi-autonomous tasks on behalf of applications. These agents can combine data access, analysis, decision logic, and actions without leaving the data platform. (&lt;a href="https://futurumgroup.com/insights/oracle-ai-world-2025-is-the-database-the-center-of-the-ai-universe-again/" rel="noopener noreferrer"&gt;Futurum&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;By making agents part of the database ecosystem, Oracle simplifies complex AI workflows that would otherwise require external orchestration layers.&lt;/p&gt;




&lt;h3&gt;
  
  
  Unified Lakehouse and Data Fabric
&lt;/h3&gt;

&lt;p&gt;A major component of the 26ai strategy is the &lt;strong&gt;Oracle Autonomous AI Lakehouse&lt;/strong&gt;, which supports open data formats like Apache Iceberg. This brings analytics, data governance, and AI together over large datasets that may reside in data lakes. (&lt;a href="https://www.oracle.com/in/news/announcement/ai-world-database-26ai-powers-the-ai-for-data-revolution-2025-10-14/" rel="noopener noreferrer"&gt;Oracle&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;The lakehouse capability blurs the traditional lines between OLTP, OLAP, and AI workloads, enabling:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unified governance&lt;/li&gt;
&lt;li&gt;Real-time analytics&lt;/li&gt;
&lt;li&gt;Schema flexibility&lt;/li&gt;
&lt;li&gt;Efficient storage for large datasets&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Developer Productivity and App Development
&lt;/h3&gt;

&lt;p&gt;Oracle AI Database 26ai also focuses on making AI development easier:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Integrated support for AI-enhanced code generation&lt;/li&gt;
&lt;li&gt;Built-in APEX enhancements with AI assistant support&lt;/li&gt;
&lt;li&gt;JavaScript stored procedures and modern language support&lt;/li&gt;
&lt;li&gt;SQL and PL/SQL enhancements for AI-centric logic (&lt;a href="https://blogs.oracle.com/database/oracle-ai-database-26ai-a-comprehensive-foundation-of-enterprise-ai-for-data" rel="noopener noreferrer"&gt;Oracle Blogs&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means developers can build intelligent applications using familiar database tools and languages.&lt;/p&gt;




&lt;h2&gt;
  
  
  Enterprise Readiness
&lt;/h2&gt;

&lt;p&gt;Oracle has positioned 26ai for mission-critical workloads:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Availability on leading cloud platforms (OCI, AWS, Azure, GCP)&lt;/li&gt;
&lt;li&gt;On-premises support with upcoming releases for Linux x86-64&lt;/li&gt;
&lt;li&gt;Distributed database capabilities with Raft-based replication&lt;/li&gt;
&lt;li&gt;Integrated security, auditing, and governance features&lt;/li&gt;
&lt;li&gt;Free and developer editions to experiment with core features&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These capabilities make 26ai suitable for large enterprises that need both AI power and operational reliability.&lt;/p&gt;




&lt;h2&gt;
  
  
  What This Means for Practitioners
&lt;/h2&gt;

&lt;p&gt;For engineers and architects, Oracle AI Database 26ai represents a shift in how we build AI-powered systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Move complex AI logic &lt;strong&gt;closer to the data&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Reduce dependency on external vector databases or feature stores&lt;/li&gt;
&lt;li&gt;Leverage a single platform for analytics, AI, and transaction processing&lt;/li&gt;
&lt;li&gt;Maintain security and governance at enterprise scale&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Adopting AI at the database layer simplifies many traditional challenges in AI pipelines, especially for regulated industries where data governance is critical.&lt;/p&gt;




&lt;p&gt;By architectural design, 26ai turns the database from a passive data store into an active AI platform. This represents a significant evolution in database engineering and opens a path for building high-performance, secure, AI-driven applications without complex external stacks.&lt;/p&gt;

&lt;p&gt;If you’re exploring AI for enterprise workloads, 26ai is worth serious technical evaluation.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Oracle AI Database 26ai&lt;/strong&gt; positions the database as an AI execution engine. That claim only matters if it holds up under:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Query planning&lt;/li&gt;
&lt;li&gt;Real retrieval pipelines&lt;/li&gt;
&lt;li&gt;Operational constraints&lt;/li&gt;
&lt;li&gt;Comparison with existing open-source stacks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s validate it properly.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. EXPLAIN PLAN for Vector Queries (What the Optimizer Actually Does)
&lt;/h2&gt;

&lt;p&gt;A common concern is whether vector search is treated as a first-class citizen or just an expensive function call.&lt;/p&gt;

&lt;p&gt;Consider this query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="n"&gt;PLAN&lt;/span&gt; &lt;span class="k"&gt;FOR&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;knowledge_base&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'payments'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;VECTOR_DISTANCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FETCH&lt;/span&gt; &lt;span class="k"&gt;FIRST&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="k"&gt;ROWS&lt;/span&gt; &lt;span class="k"&gt;ONLY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Sample EXPLAIN PLAN Output (Simplified)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;------------------------------------------------------------
| Id | Operation                 | Name                |
------------------------------------------------------------
|  0 | SELECT STATEMENT           |                     |
|  1 |  SORT ORDER BY             |                     |
|  2 |   VECTOR INDEX RANGE SCAN  | KB_VECTOR_IDX       |
|  3 |    TABLE ACCESS BY INDEX   | KNOWLEDGE_BASE      |
------------------------------------------------------------
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why This Matters
&lt;/h3&gt;

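&lt;p&gt;Key observations (the plan above is simplified; real operation names depend on the vector index type, such as HNSW or IVF):&lt;/p&gt;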

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;VECTOR INDEX RANGE SCAN&lt;/strong&gt; is a native access path&lt;/li&gt;
&lt;li&gt;Relational predicates (&lt;code&gt;category = 'payments'&lt;/code&gt;) are applied early&lt;/li&gt;
&lt;li&gt;Vector distance calculation is costed by the optimizer&lt;/li&gt;
&lt;li&gt;The plan is parallelizable like any other Oracle query&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is fundamentally different from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;External vector DB calls&lt;/li&gt;
&lt;li&gt;Post-filtering results in application code&lt;/li&gt;
&lt;li&gt;Pulling large candidate sets into memory&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In Oracle 26ai, vector search participates in the same optimization framework as joins, filters, and aggregates.&lt;/p&gt;
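&lt;p&gt;To see why applying the relational predicate before the vector ranking matters, here is a minimal, database-free Python sketch. The toy rows, the &lt;code&gt;cosine_distance&lt;/code&gt; helper, and both function names are illustrative assumptions, not Oracle APIs. It only shows how post-filtering a top-k result in application code can return fewer rows than requested.&lt;/p&gt;

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical rows: (doc_id, category, embedding)
ROWS = [
    (1, "payments", [1.0, 0.0]),
    (2, "shipping", [0.9, 0.1]),
    (3, "shipping", [0.8, 0.2]),
    (4, "payments", [0.0, 1.0]),
]

def filter_then_rank(rows, category, query, k):
    """Apply the relational predicate first, then rank by distance."""
    candidates = [r for r in rows if r[1] == category]
    candidates.sort(key=lambda r: cosine_distance(r[2], query))
    return [r[0] for r in candidates[:k]]

def rank_then_filter(rows, category, query, k):
    """Take the global top-k first, then filter in application code."""
    ranked = sorted(rows, key=lambda r: cosine_distance(r[2], query))
    return [r[0] for r in ranked[:k] if r[1] == category]

query = [1.0, 0.0]
in_db = filter_then_rank(ROWS, "payments", query, k=2)
in_app = rank_then_filter(ROWS, "payments", query, k=2)
print("filter then rank:", in_db)
print("rank then filter:", in_app)
```

&lt;p&gt;With these toy rows, filtering first yields two payment documents, while ranking first and filtering afterwards yields only one, because the shipping documents crowd the global top 2.&lt;/p&gt;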




&lt;h2&gt;
  
  
  2. End-to-End RAG with OCI Generative AI
&lt;/h2&gt;

&lt;p&gt;Now let’s build a real Retrieval-Augmented Generation flow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Architecture Overview
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7thpknk0arfghjnnri2l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7thpknk0arfghjnnri2l.png" alt=" " width="800" height="584"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5pvq8g08foshywd9hk4b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5pvq8g08foshywd9hk4b.png" alt=" " width="800" height="453"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkofuj31koih6xpm25ncs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkofuj31koih6xpm25ncs.png" alt=" " width="800" height="392"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Flow:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User submits a natural language query&lt;/li&gt;
&lt;li&gt;Application generates an embedding&lt;/li&gt;
&lt;li&gt;Oracle 26ai retrieves relevant context securely&lt;/li&gt;
&lt;li&gt;Context is sent to OCI Generative AI&lt;/li&gt;
&lt;li&gt;Model generates a grounded response&lt;/li&gt;
&lt;/ol&gt;
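&lt;p&gt;The flow above can be sketched as a small orchestration function. This is a hedged outline, not a reference implementation: &lt;code&gt;embed&lt;/code&gt;, &lt;code&gt;retrieve&lt;/code&gt;, and &lt;code&gt;generate&lt;/code&gt; are hypothetical callables standing in for the embedding model, the database query, and the OCI Generative AI call, and the stubs exist only so the sketch runs without external services.&lt;/p&gt;

```python
def answer_question(user_query, embed, retrieve, generate, k=5):
    """Run the retrieval-augmented flow with caller-supplied components."""
    query_embedding = embed(user_query)       # step 2: embed the query
    context = retrieve(query_embedding, k)    # step 3: fetch relevant rows
    return generate(user_query, context)      # steps 4-5: grounded answer

# Stub components (assumptions for illustration only).
def fake_embed(text):
    return [float(len(text))]

def fake_retrieve(embedding, k):
    docs = [
        "Refunds settle in 5 days.",
        "Chargebacks need a case ID.",
        "Cards expire monthly.",
    ]
    return docs[:k]

def fake_generate(question, context):
    return "Based on " + str(len(context)) + " passages: " + context[0]

result = answer_question(
    "How long do refunds take?", fake_embed, fake_retrieve, fake_generate, k=2
)
print(result)
```

&lt;p&gt;Passing the three components in as arguments keeps each step replaceable and easy to test in isolation.&lt;/p&gt;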




&lt;h3&gt;
  
  
  Step 1: Generate Query Embedding
&lt;/h3&gt;

&lt;p&gt;Typically done using OCI Generative AI embedding models.&lt;/p&gt;

&lt;p&gt;Pseudo-code (application layer):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generate_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cohere.embed-english&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_query&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Step 2: Secure Context Retrieval in Oracle 26ai
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;knowledge_base&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;VECTOR_DISTANCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;VECTOR_DISTANCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FETCH&lt;/span&gt; &lt;span class="k"&gt;FIRST&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="k"&gt;ROWS&lt;/span&gt; &lt;span class="k"&gt;ONLY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Important points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Row-level security policies, where configured, apply to this query like any other&lt;/li&gt;
&lt;li&gt;Masked columns remain masked for the querying user&lt;/li&gt;
&lt;li&gt;Auditing, if enabled, records the access&lt;/li&gt;
&lt;/ul&gt;

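&lt;p&gt;This step is where many RAG systems fall short on governance, because retrieval happens in a separate vector store outside the database’s access controls. Here, retrieval runs under the same policies as any other query.&lt;/p&gt;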
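&lt;p&gt;From the application side, the retrieval query might be issued as in the sketch below. It assumes an already open python-oracledb connection, a &lt;code&gt;content&lt;/code&gt; column that is returned as a plain string, and that the driver accepts a float32 &lt;code&gt;array.array&lt;/code&gt; as a VECTOR bind value. The function name and defaults are hypothetical, and the distance filter is written with the bind on the left-hand side purely for convenience.&lt;/p&gt;

```python
import array

RETRIEVAL_SQL = """
    SELECT content
    FROM knowledge_base
    WHERE :max_distance > VECTOR_DISTANCE(embedding, :query_embedding)
    ORDER BY VECTOR_DISTANCE(embedding, :query_embedding)
    FETCH FIRST :k ROWS ONLY
"""

def fetch_context(connection, query_embedding, max_distance=0.30, k=5):
    """Return up to k content strings within max_distance of the query.

    connection is assumed to be an open python-oracledb connection, so
    the query runs with the privileges and policies of that session.
    """
    binds = {
        # Assumption: VECTOR binds are passed as float32 arrays.
        "query_embedding": array.array("f", query_embedding),
        "max_distance": max_distance,
        "k": k,
    }
    with connection.cursor() as cursor:
        cursor.execute(RETRIEVAL_SQL, binds)
        return [row[0] for row in cursor.fetchall()]
```

&lt;p&gt;Because the query executes inside the caller’s database session, whatever security policies exist on &lt;code&gt;knowledge_base&lt;/code&gt; are evaluated before any row reaches the application.&lt;/p&gt;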




&lt;h3&gt;
  
  
  Step 3: Call OCI Generative AI with Retrieved Context
&lt;/h3&gt;

&lt;p&gt;Prompt template example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You are an enterprise assistant.
Answer the question using only the context below.

Context:
{{retrieved_documents}}

Question:
{{user_query}}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



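&lt;p&gt;Because retrieval runs under the database’s access controls, unauthorized rows never reach the prompt. Restricting the model to the retrieved context reduces hallucination, though it does not eliminate it, so responses still need validation.&lt;/p&gt;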
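&lt;p&gt;Filling the prompt template can be as simple as the sketch below. The &lt;code&gt;build_prompt&lt;/code&gt; helper, the passage numbering, and the &lt;code&gt;max_chars&lt;/code&gt; budget are illustrative assumptions rather than anything required by OCI Generative AI.&lt;/p&gt;

```python
PROMPT_TEMPLATE = (
    "You are an enterprise assistant.\n"
    "Answer the question using only the context below.\n\n"
    "Context:\n{retrieved_documents}\n\n"
    "Question:\n{user_query}\n"
)

def build_prompt(retrieved_documents, user_query, max_chars=4000):
    """Fill the template, numbering passages and capping context size."""
    numbered = []
    used = 0
    for index, doc in enumerate(retrieved_documents, start=1):
        entry = "[" + str(index) + "] " + doc.strip()
        if used + len(entry) > max_chars:
            break  # stop before the context budget is exceeded
        numbered.append(entry)
        used += len(entry)
    return PROMPT_TEMPLATE.format(
        retrieved_documents="\n".join(numbered),
        user_query=user_query.strip(),
    )

prompt = build_prompt(["Refunds settle in 5 days. "], " How long do refunds take? ")
print(prompt)
```

&lt;p&gt;Numbering the passages makes it easier to trace an answer back to the rows that produced it, and the size cap keeps the prompt within a predictable budget.&lt;/p&gt;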




&lt;h2&gt;
  
  
  3. Oracle 26ai vs PostgreSQL + pgvector Comparison
&lt;/h2&gt;

&lt;p&gt;This comparison is not about ideology. It is about architecture.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Oracle AI Database 26ai&lt;/th&gt;
&lt;th&gt;PostgreSQL + pgvector&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Vector type&lt;/td&gt;
&lt;td&gt;Native kernel type&lt;/td&gt;
&lt;td&gt;Extension&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query optimizer&lt;/td&gt;
&lt;td&gt;Fully vector-aware&lt;/td&gt;
&lt;td&gt;Limited cost awareness&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security&lt;/td&gt;
&lt;td&gt;Row-level, masking, auditing&lt;/td&gt;
&lt;td&gt;Row-level security built in; masking and auditing via extensions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RAG governance&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;td&gt;Weak by default&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HA and DR&lt;/td&gt;
&lt;td&gt;Built-in enterprise-grade&lt;/td&gt;
&lt;td&gt;DIY&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Operational overhead&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Moderate to high&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compliance readiness&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Depends on setup&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Key Insight
&lt;/h3&gt;

&lt;p&gt;pgvector works well for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prototypes&lt;/li&gt;
&lt;li&gt;Research&lt;/li&gt;
&lt;li&gt;Small to medium workloads&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Oracle 26ai is designed for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Regulated enterprises&lt;/li&gt;
&lt;li&gt;Mission-critical systems&lt;/li&gt;
&lt;li&gt;Large-scale operational AI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The difference is not performance alone.&lt;br&gt;
It is &lt;strong&gt;operational correctness under pressure&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Production Checklist for AI Databases
&lt;/h2&gt;

&lt;p&gt;Most AI failures are not model failures.&lt;br&gt;
They are architecture and operations failures.&lt;/p&gt;

&lt;p&gt;Here is a practical checklist for running AI workloads inside a database.&lt;/p&gt;




&lt;h3&gt;
  
  
  Data and Schema Design
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Choose embedding dimension deliberately&lt;/li&gt;
&lt;li&gt;Separate raw text and embeddings&lt;/li&gt;
&lt;li&gt;Version embeddings if models change&lt;/li&gt;
&lt;li&gt;Normalize metadata for relational filtering&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Indexing and Performance
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Create vector indexes explicitly&lt;/li&gt;
&lt;li&gt;Monitor vector index usage&lt;/li&gt;
&lt;li&gt;Validate EXPLAIN PLAN regularly&lt;/li&gt;
&lt;li&gt;Use relational predicates to reduce search space&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Security and Governance
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Enforce row-level security on source tables&lt;/li&gt;
&lt;li&gt;Mask sensitive columns before RAG&lt;/li&gt;
&lt;li&gt;Enable auditing for AI access paths&lt;/li&gt;
&lt;li&gt;Treat embeddings as sensitive data&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Operational Readiness
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Include vector indexes in backup strategy&lt;/li&gt;
&lt;li&gt;Test DR scenarios with AI workloads&lt;/li&gt;
&lt;li&gt;Monitor vector query latency separately&lt;/li&gt;
&lt;li&gt;Capacity-plan for mixed OLTP + AI workloads&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Model and Prompt Hygiene
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Store prompt templates in versioned tables&lt;/li&gt;
&lt;li&gt;Log prompts and responses for traceability&lt;/li&gt;
&lt;li&gt;Use strict system prompts for RAG&lt;/li&gt;
&lt;li&gt;Avoid free-form context injection&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  DevOps and Platform Integration
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Treat AI queries as first-class workloads&lt;/li&gt;
&lt;li&gt;Add SLOs for AI retrieval latency&lt;/li&gt;
&lt;li&gt;Monitor plan regressions after upgrades&lt;/li&gt;
&lt;li&gt;Align AI deployments with DB change management&lt;/li&gt;
&lt;/ul&gt;
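&lt;p&gt;As one way to make the latency items measurable, the sketch below checks a p95 objective over collected vector query timings. The sample latencies, the 50 ms threshold, and both helper names are made up for illustration.&lt;/p&gt;

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def slo_met(latencies_ms, threshold_ms, pct=95):
    """True when the chosen percentile is at or below the threshold."""
    return threshold_ms >= percentile(latencies_ms, pct)

# Hypothetical vector query latencies in milliseconds.
vector_query_ms = [12, 15, 11, 14, 13, 90, 12, 16, 13, 14]
print("p50:", percentile(vector_query_ms, 50))
print("p95:", percentile(vector_query_ms, 95))
print("SLO met:", slo_met(vector_query_ms, threshold_ms=50))
```

&lt;p&gt;A single slow outlier leaves the median untouched but breaks the p95 objective, which is why tail latency deserves its own SLO.&lt;/p&gt;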




&lt;h2&gt;
  
  
  Final Perspective
&lt;/h2&gt;

&lt;p&gt;Oracle AI Database 26ai does not try to replace models. It replaces fragile architecture.&lt;/p&gt;

&lt;p&gt;By making vectors, retrieval, security, and optimization part of the database engine, Oracle reduces the number of places AI systems can fail.&lt;/p&gt;

&lt;p&gt;For enterprises, this matters more than novelty.&lt;/p&gt;

&lt;p&gt;AI is not impressive when it works in a demo.&lt;br&gt;
It is impressive when it works &lt;strong&gt;every day, under load, under audit&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Oracle 26ai is clearly designed for that reality.&lt;/p&gt;







</description>
      <category>oracle</category>
      <category>26ai</category>
      <category>oracledb</category>
    </item>
    <item>
      <title>⚡ RDMA: The Networking Tech That Quietly Runs the Modern Internet</title>
      <dc:creator>kaustubh yerkade</dc:creator>
      <pubDate>Sun, 21 Dec 2025 11:40:00 +0000</pubDate>
      <link>https://forem.com/kaustubhyerkade/rdma-the-networking-tech-that-quietly-runs-the-modern-internet-1f2e</link>
      <guid>https://forem.com/kaustubhyerkade/rdma-the-networking-tech-that-quietly-runs-the-modern-internet-1f2e</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;It’s 2:17 AM.&lt;/em&gt;&lt;br&gt;
Your dashboard is green.&lt;br&gt;
CPU usage is low.&lt;br&gt;
Latency graphs look… &lt;strong&gt;too good to be true&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Yet your distributed database is pushing &lt;strong&gt;hundreds of gigabytes per second&lt;/strong&gt; across machines.&lt;/p&gt;

&lt;p&gt;No magic.&lt;br&gt;
No cheating.&lt;/p&gt;

&lt;p&gt;Just &lt;strong&gt;RDMA&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  First, the uncomfortable truth
&lt;/h2&gt;

&lt;p&gt;Most of us learned networking like this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Send data → TCP/IP → kernel → CPU → copy buffers → receive → copy again”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And we accepted it.&lt;/p&gt;

&lt;p&gt;But modern systems looked at this and said:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Why are CPUs doing delivery jobs?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And that’s where &lt;strong&gt;RDMA&lt;/strong&gt; was born.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is RDMA?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;RDMA (Remote Direct Memory Access)&lt;/strong&gt; allows &lt;strong&gt;one machine to directly read or write the memory of another machine over the network without involving the remote CPU or OS kernel&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Yes, that’s right.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory-to-memory communication over the network.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqbzsfik02sxvxzwa3rce.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqbzsfik02sxvxzwa3rce.png" alt=" " width="588" height="264"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  How networking traditionally works
&lt;/h2&gt;

&lt;p&gt;When you send data over TCP:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;App → Kernel&lt;/li&gt;
&lt;li&gt;Kernel → Network stack&lt;/li&gt;
&lt;li&gt;CPU handles interrupts&lt;/li&gt;
&lt;li&gt;Data copied multiple times&lt;/li&gt;
&lt;li&gt;Receiver CPU wakes up&lt;/li&gt;
&lt;li&gt;Kernel copies again&lt;/li&gt;
&lt;li&gt;App finally gets data&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Result:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Higher latency&lt;/li&gt;
&lt;li&gt;CPU overhead&lt;/li&gt;
&lt;li&gt;Cache pollution&lt;/li&gt;
&lt;li&gt;Context switching hell&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  How RDMA works
&lt;/h2&gt;

&lt;p&gt;With RDMA:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;App tells NIC where memory is&lt;/li&gt;
&lt;li&gt;NIC sends data directly&lt;/li&gt;
&lt;li&gt;Remote NIC writes &lt;strong&gt;straight into RAM&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Remote CPU is not disturbed&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Result 🚀:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Zero-copy&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Very low latency&lt;/strong&gt; (around 1–2 µs on typical hardware)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Near-zero CPU usage&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Line-rate throughput (100–400 Gbps)&lt;/li&gt;
&lt;/ul&gt;
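&lt;p&gt;To put the line-rate figures in perspective, here is a quick back-of-the-envelope calculation. It assumes an ideal link with no protocol overhead, so real numbers will be somewhat lower; the helper name is hypothetical.&lt;/p&gt;

```python
def transfer_time_ms(payload_bytes, link_gbps):
    """Ideal time to move payload_bytes over a link, ignoring overhead."""
    bytes_per_second = link_gbps * 1e9 / 8
    return payload_bytes / bytes_per_second * 1000

one_gib = 1024 ** 3
for gbps in (100, 400):
    gigabytes_per_second = gbps / 8
    ms = round(transfer_time_ms(one_gib, gbps), 1)
    print(gbps, "Gbps =", gigabytes_per_second, "GB/s,", ms, "ms per GiB")
```

&lt;p&gt;A single 400 Gbps link tops out at 50 GB/s, so aggregate rates in the hundreds of gigabytes per second come from many links working in parallel.&lt;/p&gt;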




&lt;h2&gt;
  
  
  The “wait… what?” moment
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;The &lt;strong&gt;remote CPU doesn’t even know&lt;/strong&gt; data arrived.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;No syscall.&lt;br&gt;
No interrupt.&lt;br&gt;
No kernel involvement.&lt;/p&gt;

&lt;p&gt;This is called &lt;strong&gt;one-sided communication&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key RDMA concepts
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Memory Registration
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Memory is “pinned”&lt;/li&gt;
&lt;li&gt;NIC gets permission to access it&lt;/li&gt;
&lt;li&gt;Prevents OS from moving it&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Queue Pairs (QP)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Send Queue + Receive Queue&lt;/li&gt;
&lt;li&gt;The application posts work requests; the NIC executes them in hardware&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. RDMA operations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;RDMA WRITE&lt;/strong&gt; → write remote memory (one-sided)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RDMA READ&lt;/strong&gt;  → read remote memory (one-sided)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RDMA SEND/RECV&lt;/strong&gt; → two-sided messaging: the receiver posts a buffer first, but the kernel is still bypassed&lt;/li&gt;
&lt;/ul&gt;
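&lt;p&gt;Real RDMA code is written against verbs libraries in C, but the ideas of memory registration and one-sided access can be mimicked in a few lines of Python. The &lt;code&gt;ToyNic&lt;/code&gt; class below is purely a mental-model aid: everything runs in one process, nothing touches a network, and all names are invented for this sketch.&lt;/p&gt;

```python
import secrets

class ToyNic:
    """In-process stand-in for an RDMA NIC (not real verbs)."""

    def __init__(self):
        self._regions = {}  # rkey -> registered bytearray

    def register_memory(self, buffer):
        """'Pin' a buffer and hand back a remote key (rkey)."""
        rkey = secrets.randbits(32)
        self._regions[rkey] = buffer
        return rkey

    def rdma_write(self, rkey, offset, data):
        """One-sided write: place data into the registered region."""
        region = self._lookup(rkey, offset, len(data))
        region[offset:offset + len(data)] = data

    def rdma_read(self, rkey, offset, length):
        """One-sided read: fetch bytes from the registered region."""
        region = self._lookup(rkey, offset, length)
        return bytes(region[offset:offset + length])

    def _lookup(self, rkey, offset, length):
        region = self._regions.get(rkey)
        if region is None:
            raise PermissionError("unknown rkey")
        if 0 > offset or offset + length > len(region):
            raise IndexError("access outside registered region")
        return region

# The "remote" host registers memory once, then shares the rkey.
remote_memory = bytearray(16)
remote_nic = ToyNic()
rkey = remote_nic.register_memory(remote_memory)

# The initiator writes and reads; no remote application code runs.
remote_nic.rdma_write(rkey, 4, b"ping")
print(bytes(remote_memory[4:8]), remote_nic.rdma_read(rkey, 4, 4))
```

&lt;p&gt;The key check mirrors why registration matters: only holders of a valid key can touch the region, and only within its bounds.&lt;/p&gt;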




&lt;h2&gt;
  
  
  RDMA vs TCP
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;TCP/IP&lt;/th&gt;
&lt;th&gt;RDMA&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;CPU usage&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Almost zero&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory copies&lt;/td&gt;
&lt;td&gt;Many&lt;/td&gt;
&lt;td&gt;Zero&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Latency&lt;/td&gt;
&lt;td&gt;~10–100 µs&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~1–2 µs&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Throughput&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Line rate&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kernel involvement&lt;/td&gt;
&lt;td&gt;Required&lt;/td&gt;
&lt;td&gt;Bypassed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Where RDMA is ACTUALLY used
&lt;/h2&gt;

&lt;p&gt;This isn’t theory. RDMA powers real systems:&lt;/p&gt;

&lt;h3&gt;
  
  
  Distributed Databases
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;In-memory DBs&lt;/li&gt;
&lt;li&gt;Sharded OLTP engines&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Storage
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;NVMe over Fabrics&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Distributed file systems&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  AI / ML
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;GPU-to-GPU communication&lt;/li&gt;
&lt;li&gt;Large model training&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Finance
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;High-frequency trading&lt;/li&gt;
&lt;li&gt;Real-time risk engines&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If something is &lt;strong&gt;fast + distributed&lt;/strong&gt;, RDMA is probably underneath.&lt;/p&gt;




&lt;h2&gt;
  
  
  RDMA doesn’t mean “new cables”
&lt;/h2&gt;

&lt;p&gt;RDMA can run on a dedicated fabric or over existing Ethernet and IP networks:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;th&gt;Runs On&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;InfiniBand&lt;/td&gt;
&lt;td&gt;Dedicated InfiniBand fabric (native RDMA)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RoCE&lt;/td&gt;
&lt;td&gt;Ethernet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;iWARP&lt;/td&gt;
&lt;td&gt;TCP/IP&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Many large Ethernet-based data centers use &lt;strong&gt;RoCE&lt;/strong&gt; (typically RoCEv2) today.&lt;/p&gt;

&lt;p&gt;RoCE header format:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgnt3p9vypzdvw49lq0hh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgnt3p9vypzdvw49lq0hh.png" alt=" " width="800" height="344"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Examples of RDMA NICs:&lt;br&gt;
Intel: Ethernet 810/820 Series (iWARP &amp;amp; RoCEv2).&lt;br&gt;
Marvell: FastLinQ (Universal RDMA, RoCE &amp;amp; iWARP).&lt;br&gt;
Chelsio: Known for high-performance RDMA adapters. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjqg7azf3pehwqd6857jl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjqg7azf3pehwqd6857jl.png" alt=" " width="800" height="219"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why RDMA isn’t everywhere
&lt;/h2&gt;

&lt;p&gt;Because RDMA is &lt;strong&gt;powerful but dangerous&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requires special NICs&lt;/li&gt;
&lt;li&gt;Needs lossless networks&lt;/li&gt;
&lt;li&gt;Hard to debug&lt;/li&gt;
&lt;li&gt;Memory bugs = catastrophic&lt;/li&gt;
&lt;li&gt;Not beginner-friendly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;RDMA is like giving your NIC &lt;strong&gt;root access to RAM&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;So handle it with care.&lt;/p&gt;




&lt;p&gt;Even if you never write RDMA code:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your &lt;strong&gt;databases&lt;/strong&gt; may use it&lt;/li&gt;
&lt;li&gt;Your &lt;strong&gt;cloud storage&lt;/strong&gt; may depend on it&lt;/li&gt;
&lt;li&gt;Your &lt;strong&gt;AI workloads&lt;/strong&gt; often rely on it for multi-node training&lt;/li&gt;
&lt;li&gt;Your &lt;strong&gt;latency SLAs&lt;/strong&gt; benefit from it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Understanding RDMA explains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Why cloud infra feels “unreasonably fast”&lt;/li&gt;
&lt;li&gt;Why CPUs aren’t the bottleneck anymore&lt;/li&gt;
&lt;li&gt;Why networking is becoming hardware-driven&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The mental model that sticks
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;TCP/IP =&lt;br&gt;
“Please OS, copy this data safely and let me know”&lt;/p&gt;

&lt;p&gt;RDMA =&lt;br&gt;
“Hey NIC, write directly into that memory address over there”&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  One-line takeaway
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;RDMA is how modern data centers move data at near-memory speed across machines.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;RDMA isn’t hype.&lt;br&gt;
It’s &lt;strong&gt;invisible infrastructure&lt;/strong&gt;, the kind that changes everything quietly.&lt;/p&gt;

&lt;p&gt;The next time you see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;insane throughput&lt;/li&gt;
&lt;li&gt;suspiciously low CPU&lt;/li&gt;
&lt;li&gt;lightning-fast distributed systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You’ll know.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RDMA is working.&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;References - &lt;br&gt;
&lt;a href="https://wiki.debian.org/RDMA" rel="noopener noreferrer"&gt;https://wiki.debian.org/RDMA&lt;/a&gt;&lt;br&gt;
&lt;a href="https://en.wikipedia.org/wiki/RDMA_over_Converged_Ethernet" rel="noopener noreferrer"&gt;https://en.wikipedia.org/wiki/RDMA_over_Converged_Ethernet&lt;/a&gt;&lt;br&gt;
&lt;a href="https://resource.fs.com/mall/resource/intel-e810-series-ethernet-network-adapters-datasheet-20251125121144.pdf" rel="noopener noreferrer"&gt;https://resource.fs.com/mall/resource/intel-e810-series-ethernet-network-adapters-datasheet-20251125121144.pdf&lt;/a&gt;&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>networking</category>
      <category>performance</category>
    </item>
    <item>
      <title>Google Cloud Platform (GCP) Explained | The Engineer’s Cloud ☁️</title>
      <dc:creator>kaustubh yerkade</dc:creator>
      <pubDate>Sat, 20 Dec 2025 06:03:03 +0000</pubDate>
      <link>https://forem.com/kaustubhyerkade/google-cloud-platform-gcp-explained-the-engineers-cloud-1j6</link>
      <guid>https://forem.com/kaustubhyerkade/google-cloud-platform-gcp-explained-the-engineers-cloud-1j6</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“AWS feels like a supermarket. Azure feels like enterprise paperwork.&lt;br&gt;
GCP feels like... an engineer built it.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you’ve ever opened the GCP console and thought:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“This looks clean, but different”&lt;/li&gt;
&lt;li&gt;“Why is everything so data centric?”&lt;/li&gt;
&lt;li&gt;“Why do Kubernetes and BigQuery feel native here?”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This blog is for you.&lt;/p&gt;

&lt;p&gt;Let’s &lt;strong&gt;demystify Google Cloud Platform&lt;/strong&gt;: how it works, why it exists, and why engineers quietly love it.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is Google Cloud Platform (GCP)?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Google Cloud Platform&lt;/strong&gt; is Google’s public cloud offering, built on the same backbone that powers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Google Search&lt;/li&gt;
&lt;li&gt;YouTube&lt;/li&gt;
&lt;li&gt;Gmail&lt;/li&gt;
&lt;li&gt;Google Maps&lt;/li&gt;
&lt;li&gt;Android ecosystem&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Unlike other clouds that &lt;em&gt;adapted&lt;/em&gt; to scale later, &lt;strong&gt;Google was born at planetary scale&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That DNA matters.&lt;/p&gt;




&lt;h2&gt;
  
  
  GCP’s Philosophy (This Is the Big Difference)
&lt;/h2&gt;

&lt;p&gt;Most clouds say:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Here’s infrastructure. You manage it.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;GCP says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“Here’s a distributed system. Don’t worry about servers.”&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Core ideas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Containers first &lt;/li&gt;
&lt;li&gt;Data everywhere&lt;/li&gt;
&lt;li&gt;Managed &amp;gt; Manual&lt;/li&gt;
&lt;li&gt;SRE &amp;gt; SysAdmin&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Google doesn’t want you babysitting servers.&lt;br&gt;
They want you &lt;strong&gt;shipping systems&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  How GCP Is Structured
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmqmwjyoh7wfnyzvxumk7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmqmwjyoh7wfnyzvxumk7.png" alt="Image" width="800" height="468"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frinvhkgtasx92hbhtpze.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frinvhkgtasx92hbhtpze.png" alt="Image" width="670" height="349"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc9o4yov1t4ux5mugk027.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc9o4yov1t4ux5mugk027.png" alt="Image" width="800" height="453"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Global Infrastructure
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Regions → Zones&lt;/li&gt;
&lt;li&gt;Private global fiber network&lt;/li&gt;
&lt;li&gt;Same network Google uses internally&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Your VM in Mumbai talks to BigQuery in the US &lt;strong&gt;without touching the public internet&lt;/strong&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. Core Layers
&lt;/h3&gt;

&lt;p&gt;Think in &lt;strong&gt;layers&lt;/strong&gt;, not services.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Compute&lt;/td&gt;
&lt;td&gt;Run code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Storage&lt;/td&gt;
&lt;td&gt;Store data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Networking&lt;/td&gt;
&lt;td&gt;Connect everything&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data&lt;/td&gt;
&lt;td&gt;Analyze everything&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DevOps&lt;/td&gt;
&lt;td&gt;Build &amp;amp; deploy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security&lt;/td&gt;
&lt;td&gt;Protect by default&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  3. Compute: Where Your Code Runs
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Futp546imtxr211la6hrl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Futp546imtxr211la6hrl.png" alt="Image" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcloud.google.com%2Fstatic%2Fkubernetes-engine%2Fimages%2Fgke-architecture.svg%3Futm_source%3Dchatgpt.com" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcloud.google.com%2Fstatic%2Fkubernetes-engine%2Fimages%2Fgke-architecture.svg%3Futm_source%3Dchatgpt.com" alt="Image" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxrgqq70kqm5i7oisj5yy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxrgqq70kqm5i7oisj5yy.png" alt="Image" width="800" height="875"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Compute Engine (GCE)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Virtual machines&lt;/li&gt;
&lt;li&gt;Custom machine types&lt;/li&gt;
&lt;li&gt;Per-second billing &lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Google Kubernetes Engine (GKE)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Kubernetes &lt;strong&gt;created by Google&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Industry gold standard&lt;/li&gt;
&lt;li&gt;Auto-scaling, auto-repair, auto peace :)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cloud Run (Serverless)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Deploy a container&lt;/li&gt;
&lt;li&gt;Zero infrastructure thinking&lt;/li&gt;
&lt;li&gt;Scales from &lt;strong&gt;0 → millions&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If AWS is EC2-first, &lt;strong&gt;GCP is Kubernetes-first&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Storage &amp;amp; Databases (Google Loves Data)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fny6sva3r9b2p0hvo6alj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fny6sva3r9b2p0hvo6alj.png" alt="Image" width="800" height="425"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frpmnxeu2t2rlcguiwvq1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frpmnxeu2t2rlcguiwvq1.jpg" alt="Image" width="800" height="266"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3zsq8fyd4g6nul201pbu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3zsq8fyd4g6nul201pbu.png" alt="Image" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Cloud Storage
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Object storage&lt;/li&gt;
&lt;li&gt;Simple, fast, global&lt;/li&gt;
&lt;li&gt;Ideal for logs, backups, media&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  BigQuery - The Star🌟
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Serverless data warehouse&lt;/li&gt;
&lt;li&gt;Query TBs in seconds&lt;/li&gt;
&lt;li&gt;SQL without infra pain&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;BigQuery feels like cheating, and that’s the point.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Cloud Spanner
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Globally consistent SQL database&lt;/li&gt;
&lt;li&gt;Horizontal scaling&lt;/li&gt;
&lt;li&gt;Used by Google internally&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Data Engineering &amp;amp; Streaming (GCP’s Superpower)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbwafs2aqx3p755ynrmoi.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbwafs2aqx3p755ynrmoi.PNG" alt="Image" width="700" height="259"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foaf1932elnukjwfuwvw3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foaf1932elnukjwfuwvw3.png" alt="Image" width="580" height="326"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4njlbdjaxjntiguk9oni.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4njlbdjaxjntiguk9oni.jpg" alt="Image" width="800" height="518"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  🔸 Pub/Sub
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Global messaging system&lt;/li&gt;
&lt;li&gt;Event-driven architectures&lt;/li&gt;
&lt;li&gt;Handles insane throughput&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔸 Dataflow (Apache Beam)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Unified batch + streaming&lt;/li&gt;
&lt;li&gt;Managed Flink-like pipelines&lt;/li&gt;
&lt;li&gt;No cluster management&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’re a &lt;strong&gt;data engineer,&lt;/strong&gt; GCP feels like home.&lt;/p&gt;




&lt;h2&gt;
  
  
  DevOps on GCP (Clean &amp;amp; Opinionated)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F56p3k1fjss4l8jzjpjfa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F56p3k1fjss4l8jzjpjfa.png" alt="Image" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjb7e8ch64yr5as0jrdqb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjb7e8ch64yr5as0jrdqb.png" alt="Image" width="458" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F096cpbx20dqrfughxtcp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F096cpbx20dqrfughxtcp.png" alt="Image" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  🔹 Cloud Build
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Serverless CI/CD&lt;/li&gt;
&lt;li&gt;YAML pipelines&lt;/li&gt;
&lt;li&gt;GitHub / GitLab friendly&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔹 Artifact Registry
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Docker, Maven, npm, Python&lt;/li&gt;
&lt;li&gt;One registry to rule them all&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔹 Cloud Operations (formerly Stackdriver)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Logs&lt;/li&gt;
&lt;li&gt;Metrics&lt;/li&gt;
&lt;li&gt;Traces&lt;/li&gt;
&lt;li&gt;Alerts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Logging on GCP is &lt;strong&gt;ridiculously good&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Security: Quietly Excellent
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;IAM is &lt;strong&gt;resource-centric&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Default encryption everywhere&lt;/li&gt;
&lt;li&gt;Zero Trust mindset&lt;/li&gt;
&lt;li&gt;VPC Service Controls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You don’t &lt;em&gt;add&lt;/em&gt; security later; it’s baked in.&lt;/p&gt;




&lt;h2&gt;
  
  
  GCP vs AWS vs Azure (Quick Reality Check)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Area&lt;/th&gt;
&lt;th&gt;GCP&lt;/th&gt;
&lt;th&gt;AWS&lt;/th&gt;
&lt;th&gt;Azure&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Kubernetes&lt;/td&gt;
&lt;td&gt;Best&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data &amp;amp; Analytics&lt;/td&gt;
&lt;td&gt;Best&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Average&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UI Simplicity&lt;/td&gt;
&lt;td&gt;Clean&lt;/td&gt;
&lt;td&gt;Cluttered&lt;/td&gt;
&lt;td&gt;Corporate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Enterprise Legacy&lt;/td&gt;
&lt;td&gt;Meh&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Best&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Learning Curve&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;Steep&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Who Should Choose GCP?
&lt;/h2&gt;

&lt;p&gt;Choose GCP if you are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;DevOps / SRE&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;Data Engineer&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;Java + Beam + Flink developer&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Building &lt;strong&gt;event-driven systems&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Tired of managing servers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Avoid GCP if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You need legacy enterprise tooling&lt;/li&gt;
&lt;li&gt;You want every service under the sun (AWS wins here)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why Engineers Fall in Love with GCP
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Kubernetes just &lt;em&gt;works&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;BigQuery changes how you think about data&lt;/li&gt;
&lt;li&gt;Clean console&lt;/li&gt;
&lt;li&gt;Less ops, more engineering&lt;/li&gt;
&lt;li&gt;Google-grade infrastructure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;GCP doesn’t scream.&lt;br&gt;
It &lt;strong&gt;delivers quietly&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;AWS teaches you cloud.&lt;br&gt;
Azure teaches you enterprise.&lt;br&gt;
&lt;strong&gt;GCP teaches you distributed systems.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you truly want to understand &lt;strong&gt;how modern systems are built&lt;/strong&gt;, GCP is not optional; it’s essential.&lt;/p&gt;




&lt;p&gt;I hope you liked this introductory post on GCP.&lt;br&gt;
If you want a next part, please let me know in the comments.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Building a Production-Grade Java App on GCP&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GCP Data Engineering Pipeline (Beam + Dataflow + BigQuery)&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;AWS vs GCP: Architecture War Stories&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;How Google Runs SRE Internally&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>beginners</category>
      <category>cloudcomputing</category>
      <category>googlecloud</category>
    </item>
    <item>
      <title>Chaos Engineering: Breaking Production on Purpose (So It Never Breaks Again)</title>
      <dc:creator>kaustubh yerkade</dc:creator>
      <pubDate>Sat, 13 Dec 2025 10:00:07 +0000</pubDate>
      <link>https://forem.com/kaustubhyerkade/chaos-engineering-breaking-production-on-purpose-so-it-never-breaks-again-401j</link>
      <guid>https://forem.com/kaustubhyerkade/chaos-engineering-breaking-production-on-purpose-so-it-never-breaks-again-401j</guid>
      <description>&lt;h2&gt;
  
  
  1. The 2 AM Pager Story
&lt;/h2&gt;

&lt;p&gt;It’s &lt;strong&gt;2:00 AM&lt;/strong&gt;. You are fast asleep, like Makka Pakka.&lt;br&gt;
Then your phone vibrates like it’s possessed.&lt;br&gt;
Slack is exploding. Grafana dashboards are red.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0mrnx7mh74ws011qj5p2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0mrnx7mh74ws011qj5p2.png" alt=" " width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The message says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🚨 &lt;em&gt;Production Down - High Error Rate&lt;/em&gt; &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But wait…&lt;br&gt;
This system was &lt;em&gt;highly available&lt;/em&gt;.&lt;br&gt;
Multi-AZ. Auto scaling. Health checks. Load balancers.&lt;br&gt;
&lt;strong&gt;All the right boxes were checked.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So what went wrong?&lt;/p&gt;

&lt;p&gt;A single node died.&lt;br&gt;
A cache dependency slowed down.&lt;br&gt;
Retries snowballed.&lt;br&gt;
Threads got exhausted.&lt;br&gt;
And suddenly... everything collapsed.&lt;/p&gt;

&lt;p&gt;That night teaches you one painful truth:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Just because a system looks reliable on paper doesn’t mean it survives real failure.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A legend once said: “The best way to prevent outages? Cause them first.”&lt;/p&gt;

&lt;p&gt;Welcome to &lt;strong&gt;Chaos Engineering&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. What Is Chaos Engineering?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;One-liner:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Chaos Engineering is the practice of &lt;em&gt;intentionally breaking things&lt;/em&gt; to learn how your system behaves under failure.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;In simple words:&lt;/strong&gt;&lt;br&gt;
You don’t wait for outages to teach you lessons.&lt;br&gt;
You &lt;strong&gt;create controlled failures&lt;/strong&gt; on your terms, during working hours, so production doesn’t teach you lessons at 2 AM.&lt;/p&gt;

&lt;p&gt;Why it exists:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Modern systems are &lt;strong&gt;distributed&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Failures are &lt;strong&gt;inevitable&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Humans are &lt;strong&gt;bad at predicting edge cases&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Chaos Engineering accepts reality instead of fighting it.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Why Traditional Testing Is Not Enough
&lt;/h2&gt;

&lt;p&gt;Let’s be honest.&lt;/p&gt;

&lt;p&gt;We already do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unit tests&lt;/li&gt;
&lt;li&gt;Integration tests&lt;/li&gt;
&lt;li&gt;Load tests&lt;/li&gt;
&lt;li&gt;UAT&lt;/li&gt;
&lt;li&gt;Pre-prod validations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And yet production still fails.&lt;/p&gt;

&lt;p&gt;Why?&lt;/p&gt;

&lt;p&gt;Because traditional testing assumes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dependencies behave normally&lt;/li&gt;
&lt;li&gt;Networks are reliable&lt;/li&gt;
&lt;li&gt;Latency is predictable&lt;/li&gt;
&lt;li&gt;Partial failures won’t cascade&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;In reality:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Databases slow down, not just crash&lt;/li&gt;
&lt;li&gt;Networks lie&lt;/li&gt;
&lt;li&gt;Third-party APIs time out randomly&lt;/li&gt;
&lt;li&gt;Distributed systems fail in &lt;em&gt;creative&lt;/em&gt; ways&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Most outages come from &lt;strong&gt;unknown unknowns&lt;/strong&gt;, not code bugs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Chaos Engineering is how you &lt;em&gt;discover those unknowns before users do&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4mqh8621t4nahvq3ku5x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4mqh8621t4nahvq3ku5x.png" alt=" " width="800" height="390"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Core Principles of Chaos Engineering
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Define Steady State
&lt;/h3&gt;

&lt;p&gt;What does “healthy” look like?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Request success rate&lt;/li&gt;
&lt;li&gt;Latency percentiles&lt;/li&gt;
&lt;li&gt;Error budgets&lt;/li&gt;
&lt;li&gt;Business KPIs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you don’t define this, you’re just breaking stuff blindly.&lt;/p&gt;
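&lt;p&gt;To make “healthy” concrete, here is a minimal Python sketch of a steady-state check. The thresholds, the &lt;code&gt;SteadyState&lt;/code&gt; class, and the function names are illustrative assumptions, not part of any particular monitoring tool; in practice the inputs would come from your metrics system.&lt;/p&gt;

```python
import math
from dataclasses import dataclass


@dataclass
class SteadyState:
    """Hypothetical thresholds that define "healthy" for one service."""
    min_success_rate: float    # e.g. 0.995 means 99.5% of requests succeed
    max_p99_latency_ms: float  # slowest acceptable 99th-percentile latency


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list; pct is an integer 1..100."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def in_steady_state(status_codes, latencies_ms, target):
    """Return True when both the success rate and the p99 latency meet the target."""
    # Treat any status below 500 as a success (an assumption; adjust to your SLO).
    successes = sum(1 for code in status_codes if 500 > code)
    success_rate = successes / len(status_codes)
    p99 = percentile(latencies_ms, 99)
    return success_rate >= target.min_success_rate and target.max_p99_latency_ms >= p99
```

&lt;p&gt;Run the same check before, during, and after an experiment, so “did the system stay healthy?” has a yes-or-no answer.&lt;/p&gt;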




&lt;h3&gt;
  
  
  2. Inject Real Failures
&lt;/h3&gt;

&lt;p&gt;Not mocks. Not simulations. Real failures, like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Killing pods&lt;/li&gt;
&lt;li&gt;Adding latency&lt;/li&gt;
&lt;li&gt;Breaking network calls&lt;/li&gt;
&lt;li&gt;Throttling CPU&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  3. Run Experiments in Production (Carefully)
&lt;/h3&gt;

&lt;p&gt;Yes, &lt;strong&gt;production&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Why?&lt;br&gt;
Because only production has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real traffic&lt;/li&gt;
&lt;li&gt;Real data&lt;/li&gt;
&lt;li&gt;Real chaos&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But this is done:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gradually&lt;/li&gt;
&lt;li&gt;During safe windows&lt;/li&gt;
&lt;li&gt;With rollback plans&lt;/li&gt;
&lt;li&gt;During scheduled downtime&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  4. Automate and Learn Continuously
&lt;/h3&gt;

&lt;p&gt;Chaos is not a one-time stunt.&lt;br&gt;
It’s a &lt;strong&gt;continuous feedback loop&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Common Chaos Experiments With Examples
&lt;/h2&gt;

&lt;p&gt;Here’s what teams &lt;em&gt;actually&lt;/em&gt; break:&lt;/p&gt;

&lt;h3&gt;
  
  
  Kill Pods / Instances
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl delete pod payment-service-xyz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Question:&lt;/strong&gt;&lt;br&gt;
Does traffic reroute smoothly?&lt;br&gt;
Do users notice?&lt;/p&gt;




&lt;h3&gt;
  
  
  Network Latency &amp;amp; Packet Loss
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Add 500ms latency between services&lt;/li&gt;
&lt;li&gt;Drop 10% of packets&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Exposes:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Retry storms&lt;/li&gt;
&lt;li&gt;Timeout misconfigurations&lt;/li&gt;
&lt;/ul&gt;
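&lt;p&gt;Retry storms usually come from clients that retry immediately and without limit. A common mitigation is capped exponential backoff with jitter. The sketch below is one way to write it in Python; the function name, parameters, and default values are assumptions chosen for illustration.&lt;/p&gt;

```python
import random
import time


def call_with_backoff(operation, max_attempts=4, base_delay=0.1,
                      max_delay=2.0, sleep=time.sleep):
    """Call operation(), retrying failures with capped exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            # Out of attempts: surface the last error instead of retrying forever.
            if attempt == max_attempts - 1:
                raise
            # The delay ceiling doubles each attempt, up to max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: a random wait keeps clients from retrying in lockstep.
            sleep(random.uniform(0, ceiling))
```

&lt;p&gt;The &lt;code&gt;sleep&lt;/code&gt; parameter is injectable so the behaviour can be tested without real waiting. Adding latency in a chaos experiment is a good way to check that limits like these are actually in place.&lt;/p&gt;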




&lt;h3&gt;
  
  
  Dependency Failures
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Database slows down&lt;/li&gt;
&lt;li&gt;Redis unavailable&lt;/li&gt;
&lt;li&gt;Third-party API returns 500&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Reality check:&lt;/strong&gt;&lt;br&gt;
Can your service degrade gracefully?&lt;/p&gt;
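&lt;p&gt;One simple form of graceful degradation is to serve the last known good value when a dependency fails. The Python sketch below assumes a hypothetical &lt;code&gt;fetch&lt;/code&gt; callable standing in for a Redis or API lookup; the class name is made up for this example.&lt;/p&gt;

```python
class LastKnownGood:
    """Wrap a flaky lookup and fall back to the most recent successful result."""

    def __init__(self, fetch, default):
        self._fetch = fetch    # hypothetical callable that may raise on failure
        self._value = default  # served until the first successful fetch

    def get(self):
        """Return (value, degraded); degraded is True when the fallback was used."""
        try:
            self._value = self._fetch()
            return self._value, False
        except Exception:
            # Dependency is down: answer with stale data instead of an error.
            return self._value, True
```

&lt;p&gt;Whether stale data is acceptable depends on the feature: fine for recommendations, usually not for account balances.&lt;/p&gt;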




&lt;h3&gt;
  
  
  Resource Starvation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;CPU throttling&lt;/li&gt;
&lt;li&gt;Memory pressure&lt;/li&gt;
&lt;li&gt;Disk full&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These failures are &lt;em&gt;far more common&lt;/em&gt; than total crashes.&lt;/p&gt;




&lt;h3&gt;
  
  
  AZ / Region Failure
&lt;/h3&gt;

&lt;p&gt;Simulate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One Availability Zone going down&lt;/li&gt;
&lt;li&gt;Load balancer losing backends&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;This is where “multi-AZ” claims are tested.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Chaos Engineering in Kubernetes &amp;amp; Cloud
&lt;/h2&gt;

&lt;p&gt;Kubernetes makes chaos &lt;em&gt;easy&lt;/em&gt; (sometimes too easy).&lt;/p&gt;

&lt;h3&gt;
  
  
  Kubernetes Chaos
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Kill pods randomly&lt;/li&gt;
&lt;li&gt;Drain nodes&lt;/li&gt;
&lt;li&gt;Evict workloads&lt;/li&gt;
&lt;li&gt;Break DNS&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cloud-Native Chaos
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Terminate EC2 instances&lt;/li&gt;
&lt;li&gt;Restrict IAM permissions&lt;/li&gt;
&lt;li&gt;Break network routes&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Popular Tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chaos Monkey&lt;/strong&gt; - The original chaos tool, from Netflix&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LitmusChaos&lt;/strong&gt; - Kubernetes-native, open source&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gremlin&lt;/strong&gt; - Controlled, enterprise-grade chaos&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS FIS&lt;/strong&gt; - Native AWS fault injection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Tools don’t do chaos engineering.&lt;br&gt;
&lt;strong&gt;Mindset does.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  7. A Short, Realistic Scenario
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Setup
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Java Spring Boot microservice&lt;/li&gt;
&lt;li&gt;Kubernetes (EKS)&lt;/li&gt;
&lt;li&gt;HPA enabled&lt;/li&gt;
&lt;li&gt;Redis cache + PostgreSQL DB&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Chaos Experiment
&lt;/h3&gt;

&lt;p&gt;Kill 50% of pods during peak traffic.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Failed
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Connection pool exhausted&lt;/li&gt;
&lt;li&gt;Retry logic hammered DB&lt;/li&gt;
&lt;li&gt;Latency spiked beyond SLA&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What Chaos Exposed
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;No circuit breaker&lt;/li&gt;
&lt;li&gt;Aggressive retries&lt;/li&gt;
&lt;li&gt;Poor timeout configuration&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What Was Fixed
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Added Resilience4j&lt;/li&gt;
&lt;li&gt;Tuned retries &amp;amp; timeouts&lt;/li&gt;
&lt;li&gt;Improved readiness probes&lt;/li&gt;
&lt;/ul&gt;
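&lt;p&gt;To show what a circuit breaker does, here is a deliberately small Python sketch. It is not Resilience4j’s API, only an illustration of the pattern under assumed names and defaults: after repeated failures the breaker “opens” and fails fast, then lets a single trial call through once a cooldown has passed.&lt;/p&gt;

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout  # seconds to stay open before a trial call
        self.clock = clock                  # injectable so tests can control time
        self.failures = 0
        self.opened_at = None               # None means the circuit is closed

    def call(self, operation):
        if self.opened_at is not None:
            if self.reset_timeout > self.clock() - self.opened_at:
                # Still cooling down: fail fast and spare the struggling dependency.
                raise CircuitOpenError("circuit open, failing fast")
            # Cooldown elapsed: half-open, allow one trial call through.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

&lt;p&gt;In the scenario above, failing fast like this is what stops retries from hammering the database while it is already struggling.&lt;/p&gt;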

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;br&gt;
Same failure today -&amp;gt; users don’t even notice.&lt;/p&gt;

&lt;p&gt;That’s chaos engineering working.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. Myths &amp;amp; Misconceptions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. “Chaos engineering is reckless”
&lt;/h3&gt;

&lt;p&gt;No.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Uncontrolled production outages are reckless.&lt;/strong&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  2. “Only Netflix-scale companies need it”
&lt;/h3&gt;

&lt;p&gt;If your system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Has users&lt;/li&gt;
&lt;li&gt;Has SLAs&lt;/li&gt;
&lt;li&gt;Has on-call engineers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You need it.&lt;/p&gt;




&lt;h3&gt;
  
  
  3. “It means randomly breaking things”
&lt;/h3&gt;

&lt;p&gt;Wrong.&lt;/p&gt;

&lt;p&gt;Chaos is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hypothesis-driven&lt;/li&gt;
&lt;li&gt;Measured&lt;/li&gt;
&lt;li&gt;Reversible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Random breaking is just… bad ops.&lt;/p&gt;
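&lt;p&gt;Those three properties map naturally onto code. The sketch below is a hypothetical experiment runner in Python: the three callables are placeholders you would supply (for example a metrics check, a pod kill, and a restore step), and the function name is an assumption for illustration.&lt;/p&gt;

```python
def run_experiment(check_steady_state, inject_failure, rollback):
    """Run one chaos experiment: verify health, inject, measure, always roll back."""
    # Hypothesis-driven: only start from a known-healthy baseline.
    if not check_steady_state():
        return "skipped: system not in steady state"
    try:
        inject_failure()
        # Measured: the hypothesis is that steady state survives the failure.
        held = check_steady_state()
    finally:
        # Reversible: undo the injection even if something above raised.
        rollback()
    return "hypothesis held" if held else "weakness found"
```

&lt;p&gt;A “weakness found” result is the useful outcome: it points at something to fix before a real outage finds it.&lt;/p&gt;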




&lt;h2&gt;
  
  
  9. When You SHOULD and SHOULD NOT Do Chaos Engineering
&lt;/h2&gt;

&lt;h3&gt;
  
  
  You SHOULD when:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Monitoring &amp;amp; alerts are solid&lt;/li&gt;
&lt;li&gt;Rollback is easy&lt;/li&gt;
&lt;li&gt;Error budgets exist&lt;/li&gt;
&lt;li&gt;Team understands the system&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  You SHOULD NOT when:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;You can’t observe failures&lt;/li&gt;
&lt;li&gt;You don’t know steady state&lt;/li&gt;
&lt;li&gt;You don’t have on-call coverage&lt;/li&gt;
&lt;li&gt;Everything is already unstable&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Chaos without observability is just noise.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  10. Benefits You Actually Get
&lt;/h2&gt;

&lt;p&gt;Not buzzwords. Real outcomes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fewer production outages&lt;/li&gt;
&lt;li&gt;Faster incident response&lt;/li&gt;
&lt;li&gt;Safer deployments&lt;/li&gt;
&lt;li&gt;Better system design&lt;/li&gt;
&lt;li&gt;Confident on-call engineers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You stop &lt;em&gt;hoping&lt;/em&gt; things work.&lt;br&gt;
You &lt;strong&gt;know&lt;/strong&gt; they do.&lt;/p&gt;




&lt;h2&gt;
  
  
  11. How to Start Chaos Engineering (Beginner-Friendly)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step-by-Step Starter Plan
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Pick &lt;strong&gt;one critical service&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Define steady-state metrics&lt;/li&gt;
&lt;li&gt;Start in &lt;strong&gt;non-prod&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Kill a single pod&lt;/li&gt;
&lt;li&gt;Observe everything&lt;/li&gt;
&lt;li&gt;Fix weaknesses&lt;/li&gt;
&lt;li&gt;Repeat&lt;/li&gt;
&lt;li&gt;Slowly move to prod&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  First Chaos Experiments
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Pod kill during low traffic&lt;/li&gt;
&lt;li&gt;Add latency to one dependency&lt;/li&gt;
&lt;li&gt;Simulate DB slowness&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Small chaos beats no chaos.&lt;/p&gt;




&lt;h2&gt;
  
  
  12. Conclusion
&lt;/h2&gt;

&lt;p&gt;Chaos Engineering is not about breaking systems. It’s about breaking &lt;strong&gt;assumptions&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Failure is feedback.&lt;/strong&gt;&lt;br&gt;
Ignore it, and production will remind you loudly.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The best SREs and DevOps engineers don’t fear failure.&lt;br&gt;
They &lt;strong&gt;schedule it&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqg9hokns7n2i5rgrrjow.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqg9hokns7n2i5rgrrjow.png" alt=" " width="800" height="633"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Your Turn
&lt;/h3&gt;

&lt;p&gt;If you killed one thing in your production system &lt;em&gt;today&lt;/em&gt;,&lt;br&gt;
&lt;strong&gt;what do you think would break first?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Drop your thoughts, war stories, or doubts in the comments.&lt;br&gt;
Let’s learn from each other before the pager rings again.&lt;/p&gt;

</description>
      <category>chaosengineering</category>
      <category>devops</category>
    </item>
    <item>
      <title>AWS for Developers: The Guide</title>
      <dc:creator>kaustubh yerkade</dc:creator>
      <pubDate>Sat, 13 Dec 2025 08:25:44 +0000</pubDate>
      <link>https://forem.com/kaustubhyerkade/aws-for-developers-the-only-guide-you-actually-need-in-2026-901</link>
      <guid>https://forem.com/kaustubhyerkade/aws-for-developers-the-only-guide-you-actually-need-in-2026-901</guid>
      <description>&lt;p&gt;Cloud is no longer a “nice to have.” If you are building, scaling, shipping, or even experimenting, AWS is where much of the world runs its production workloads.&lt;/p&gt;

&lt;p&gt;But here’s the problem -&amp;gt; AWS has over 200 services. New developers often feel like they’re entering a jungle with no map.&lt;/p&gt;

&lt;p&gt;This article is that map.&lt;br&gt;
Short. Concise. Practical.&lt;/p&gt;

&lt;p&gt;Let’s go.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why AWS Still Rules the Cloud
&lt;/h2&gt;

&lt;p&gt;Even with big players like Azure and GCP growing fast, AWS dominates because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Most complete ecosystem (Compute, Storage, Database, AI, DevOps… literally everything)&lt;/li&gt;
&lt;li&gt;Battle tested global infrastructure&lt;/li&gt;
&lt;li&gt;Deep enterprise adoption&lt;/li&gt;
&lt;li&gt;Strong tooling for DevOps, automation, and infra-as-code&lt;/li&gt;
&lt;li&gt;Leader in serverless (Lambda), container orchestration (ECS/EKS), and data engineering&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're a developer, AWS is more than a cloud provider, it's a career multiplier.&lt;/p&gt;
&lt;h2&gt;
  
  
  The AWS Big 5: The Only Services Every Developer Must Know
&lt;/h2&gt;

&lt;p&gt;These are workload services used to run actual applications:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqmjoby66izcmxbte9hg9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqmjoby66izcmxbte9hg9.png" alt=" " width="800" height="187"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Amazon EC2 - Virtual Machines, Classic Style&lt;/li&gt;
&lt;li&gt;Amazon S3 - The Unlimited Storage Bucket&lt;/li&gt;
&lt;li&gt;AWS Lambda - Serverless Magic&lt;/li&gt;
&lt;li&gt;Amazon RDS - Databases Without the Ops Pain&lt;/li&gt;
&lt;li&gt;Amazon DynamoDB - The NoSQL Supercharger&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Forget the 200+ service list.&lt;br&gt;
These are the 5 services behind the bulk of real-world applications.&lt;/p&gt;
&lt;h2&gt;
  
  
  Let’s look at them one by one
&lt;/h2&gt;
&lt;h2&gt;
  
  
  1. Amazon EC2 - Elastic Compute Cloud
&lt;/h2&gt;

&lt;p&gt;The foundational compute service of AWS.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it’s top tier:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Runs your applications on virtual servers&lt;/li&gt;
&lt;li&gt;Full OS-level control&lt;/li&gt;
&lt;li&gt;Autoscaling + Load Balancing support&lt;/li&gt;
&lt;li&gt;Best for legacy applications, backend services, batch jobs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example uses:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Application servers&lt;/li&gt;
&lt;li&gt;Backend microservices&lt;/li&gt;
&lt;li&gt;Game servers&lt;/li&gt;
&lt;li&gt;High-performance web apps&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  2. Amazon S3 - Simple Storage Service
&lt;/h2&gt;

&lt;p&gt;The backbone of modern cloud storage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why everyone uses it:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Virtually unlimited object storage&lt;/li&gt;
&lt;li&gt;Extremely durable (designed for 99.999999999% durability, i.e. 11 nines)&lt;/li&gt;
&lt;li&gt;Cheap and scalable&lt;/li&gt;
&lt;li&gt;Stores everything from images to logs to ML datasets&lt;/li&gt;
&lt;/ul&gt;
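&lt;p&gt;To get a feel for what 99.999999999% durability means, here is a back-of-the-envelope calculation in Python. It treats the figure as an annual, per-object design target and assumes a hypothetical bucket of 10 million objects; exact fractions are used to avoid floating-point rounding.&lt;/p&gt;

```python
from fractions import Fraction

# Design target: 99.999999999% of objects survive a given year.
durability = Fraction(99_999_999_999, 100_000_000_000)
annual_loss_probability = 1 - durability  # chance of losing one given object in a year

objects_stored = 10_000_000               # hypothetical bucket size
expected_losses_per_year = objects_stored * annual_loss_probability
years_per_lost_object = 1 / expected_losses_per_year

print(f"Expected objects lost per year: {float(expected_losses_per_year)}")
print(f"Roughly one object lost every {int(years_per_lost_object)} years")
```

&lt;p&gt;In other words, with 10 million objects you would expect to lose about one object every 10,000 years on average. Durability is about not losing data; it is separate from availability, which is about being able to reach it right now.&lt;/p&gt;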

&lt;p&gt;&lt;strong&gt;Example uses:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Static website hosting&lt;/li&gt;
&lt;li&gt;Backup &amp;amp; archiving&lt;/li&gt;
&lt;li&gt;Data lakes&lt;/li&gt;
&lt;li&gt;CI/CD artifacts&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  3. AWS Lambda - Serverless Compute
&lt;/h2&gt;

&lt;p&gt;Run code without managing servers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it’s a top 5 service:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Only pay when code runs&lt;/li&gt;
&lt;li&gt;Scales automatically&lt;/li&gt;
&lt;li&gt;Integrates with 200+ AWS services&lt;/li&gt;
&lt;li&gt;Ideal for event-driven architectures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example uses:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API backend&lt;/li&gt;
&lt;li&gt;Automation scripts&lt;/li&gt;
&lt;li&gt;Scheduled (cron) jobs&lt;/li&gt;
&lt;li&gt;Data processing&lt;/li&gt;
&lt;/ul&gt;
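&lt;p&gt;For the API backend case, a Lambda function is just a handler that receives an event and returns a response. Below is a minimal Python sketch, assuming the function sits behind API Gateway with proxy integration; the &lt;code&gt;name&lt;/code&gt; query parameter and the greeting are made up for illustration.&lt;/p&gt;

```python
import json


def lambda_handler(event, context):
    """Entry point Lambda invokes; returns an API Gateway proxy-style response."""
    # queryStringParameters can be missing or None when no query string is sent.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

&lt;p&gt;Because the handler is a plain function, you can call it locally with a hand-built event dictionary before deploying anything.&lt;/p&gt;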
&lt;h2&gt;
  
  
  4. Amazon RDS - Relational Database Service
&lt;/h2&gt;

&lt;p&gt;Managed SQL databases without DBA overhead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it’s essential:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automated backups&lt;/li&gt;
&lt;li&gt;Multi-AZ replication&lt;/li&gt;
&lt;li&gt;High availability built-in&lt;/li&gt;
&lt;li&gt;Supports MySQL, PostgreSQL, SQL Server, Oracle, Aurora&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example uses:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Banking apps&lt;/li&gt;
&lt;li&gt;ERP / CRM systems&lt;/li&gt;
&lt;li&gt;Transactional websites&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  5. Amazon DynamoDB - Fully Managed NoSQL Database
&lt;/h2&gt;

&lt;p&gt;A super-fast, massively scalable NoSQL solution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why top companies love it:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Single-digit millisecond reads/writes&lt;/li&gt;
&lt;li&gt;Auto-scaling to millions of requests&lt;/li&gt;
&lt;li&gt;Zero-downtime operations&lt;/li&gt;
&lt;li&gt;Serverless + event-driven&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example uses:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Shopping carts&lt;/li&gt;
&lt;li&gt;Gaming state data&lt;/li&gt;
&lt;li&gt;IoT device storage&lt;/li&gt;
&lt;li&gt;Microservices&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  AWS Developer Workflow:
&lt;/h2&gt;

&lt;p&gt;How Modern Teams Build Apps on AWS&lt;/p&gt;

&lt;p&gt;Here’s what a typical production-grade AWS architecture looks like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Frontend → CloudFront → S3&lt;/li&gt;
&lt;li&gt;API → API Gateway → Lambda/ECS&lt;/li&gt;
&lt;li&gt;Database → RDS / DynamoDB&lt;/li&gt;
&lt;li&gt;Messaging → SNS / SQS&lt;/li&gt;
&lt;li&gt;CI/CD → CodePipeline / GitHub Actions&lt;/li&gt;
&lt;li&gt;Infra → Terraform / CDK&lt;/li&gt;
&lt;li&gt;Monitoring → CloudWatch / X-Ray&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This stack is scalable, fault-tolerant, and cost-efficient.&lt;/p&gt;


&lt;h2&gt;
  
  
  AWS DevOps Essentials
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Infrastructure as Code (IaC)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Tools:&lt;br&gt;
Terraform&lt;br&gt;
AWS CDK&lt;br&gt;
CloudFormation&lt;/p&gt;

&lt;p&gt;Using IaC ensures:&lt;/p&gt;

&lt;p&gt;Version-controlled infra&lt;br&gt;
Consistent deployments&lt;br&gt;
Automated scaling + repeatability&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CI/CD with AWS&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Common choices:&lt;/p&gt;

&lt;p&gt;AWS CodePipeline&lt;br&gt;
GitHub Actions&lt;br&gt;
GitLab CI&lt;br&gt;
Jenkins on EC2&lt;/p&gt;

&lt;p&gt;Typical pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Build → Test → Security Scan → Deploy → Verify
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
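
&lt;p&gt;The key property of that pipeline is that a failed stage blocks everything after it. A rough Python sketch of the idea, using the stage names above; the check functions are placeholders, not calls into any real CI system:&lt;/p&gt;

```python
STAGES = ["Build", "Test", "Security Scan", "Deploy", "Verify"]


def run_pipeline(stage_checks):
    """Run stages in order; stop at the first one that fails.

    stage_checks maps a stage name to a zero-argument function returning
    True on success. Returns the list of stages that completed.
    """
    completed = []
    for stage in STAGES:
        if not stage_checks[stage]():
            print(f"{stage} failed, stopping the pipeline")
            break
        completed.append(stage)
    return completed


if __name__ == "__main__":
    # Placeholder checks: every stage passes except the security scan.
    checks = {stage: (lambda: True) for stage in STAGES}
    checks["Security Scan"] = lambda: False  # simulate a failed scan
    print(run_pipeline(checks))
```

&lt;p&gt;With the simulated failure, Deploy and Verify never run, which is exactly why the scan sits before the deploy step.&lt;/p&gt;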



&lt;p&gt;&lt;strong&gt;Monitoring &amp;amp; Logging&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;CloudWatch Logs &amp;amp; Metrics&lt;br&gt;
AWS X-Ray for tracing&lt;br&gt;
CloudTrail for auditing&lt;br&gt;
OpenSearch for log analytics&lt;/p&gt;

&lt;p&gt;Observability is non-negotiable in 2025.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Summary Cheat Sheet
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rank&lt;/th&gt;
&lt;th&gt;AWS Service&lt;/th&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Why It's Top 5&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;⭐ 1&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;EC2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Compute&lt;/td&gt;
&lt;td&gt;Full control, flexible, widely used for apps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⭐ 2&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;S3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Storage&lt;/td&gt;
&lt;td&gt;Durable, cheap, global, foundation for many systems&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⭐ 3&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Lambda&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Serverless&lt;/td&gt;
&lt;td&gt;Zero server management, event-driven apps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⭐ 4&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;RDS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Database&lt;/td&gt;
&lt;td&gt;Managed SQL, scalable, secure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⭐ 5&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;DynamoDB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;NoSQL&lt;/td&gt;
&lt;td&gt;High-speed, fully managed, microservice-friendly&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Part 2
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;How AWS works&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What the core services are&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;How they connect together&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A mental model to stop AWS from feeling chaotic&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’ve written this in &lt;strong&gt;simple language&lt;/strong&gt;, with &lt;strong&gt;engineering intuition&lt;/strong&gt;, not marketing fluff.&lt;/p&gt;




&lt;h1&gt;
  
  
  AWS Without the Headache: A Simple Mental Model That Finally Makes Sense
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“AWS is not hard. AWS is too big to explain at once.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you’re new to AWS (or even experienced), you’ve probably felt this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Too many services&lt;/li&gt;
&lt;li&gt;Weird names (EC2, IAM, VPC, ALB, EKS... why?)&lt;/li&gt;
&lt;li&gt;Every tutorial starts in the &lt;strong&gt;middle&lt;/strong&gt;, never at the beginning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So let’s do something different.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;We won’t start with services&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;We’ll start with how AWS actually works&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  First: AWS is like a Massive Digital City
&lt;/h2&gt;

&lt;p&gt;AWS is &lt;strong&gt;infrastructure&lt;/strong&gt;, rented by the hour.&lt;/p&gt;

&lt;p&gt;Imagine AWS as a &lt;strong&gt;city&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;City Concept&lt;/th&gt;
&lt;th&gt;AWS Equivalent&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Land&lt;/td&gt;
&lt;td&gt;AWS Regions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Buildings&lt;/td&gt;
&lt;td&gt;Data Centers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Electricity&lt;/td&gt;
&lt;td&gt;Compute (EC2, Lambda)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Roads&lt;/td&gt;
&lt;td&gt;Networking (VPC, Subnets)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security Guards&lt;/td&gt;
&lt;td&gt;IAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Warehouses&lt;/td&gt;
&lt;td&gt;S3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Post Office&lt;/td&gt;
&lt;td&gt;SNS / SQS&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Once you see this, AWS stops feeling scary.&lt;/p&gt;




&lt;h2&gt;
  
  
  Regions &amp;amp; Availability Zones - The Foundation
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fznxehdkwdaujiqehcgch.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fznxehdkwdaujiqehcgch.png" alt=" " width="800" height="399"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Before &lt;strong&gt;any service&lt;/strong&gt;, AWS gives you &lt;strong&gt;geography&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Region
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Physical location (Mumbai, Frankfurt, Virginia)&lt;/li&gt;
&lt;li&gt;Completely isolated from other regions&lt;/li&gt;
&lt;li&gt;A region is basically a cluster of data centers. &lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Availability Zone (AZ)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;One region has &lt;strong&gt;multiple AZs&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Each AZ = one or more independent data centers&lt;/li&gt;
&lt;li&gt;Used for &lt;strong&gt;high availability&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Mumbai region (ap-south-1) has 3 AZs:&lt;br&gt;
ap-south-1a&lt;br&gt;
ap-south-1b&lt;br&gt;
ap-south-1c&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Region Code (e.g., us-east-1, ap-south-1):&lt;/strong&gt; Identifies the broad geographical area (e.g., US East, Asia Pacific) and its specific location.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Letter Suffix (e.g., a, b, c):&lt;/strong&gt; Appended to the region code to name one Availability Zone in that region, a physically separate location with its own power, cooling, and networking.&lt;/p&gt;
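
&lt;p&gt;The naming scheme is regular enough to split mechanically. The Python sketch below assumes standard AZ names only (region code plus one lowercase letter, such as ap-south-1a); the helper name split_az_name is made up for this example, and Local Zones or other special names are out of scope.&lt;/p&gt;

```python
import re

# Assumption: standard AZ names look like region code plus one lowercase
# letter, e.g. "ap-south-1a". Local Zones and similar names are out of scope.
AZ_NAME = re.compile(r"([a-z]+(?:-[a-z]+)+-\d+)([a-z])")


def split_az_name(az_name):
    """Split an AZ name into (region_code, zone_letter)."""
    match = AZ_NAME.fullmatch(az_name)
    if match is None:
        raise ValueError(f"not a standard AZ name: {az_name!r}")
    return match.group(1), match.group(2)


if __name__ == "__main__":
    print(split_az_name("ap-south-1a"))
```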

&lt;p&gt;&lt;strong&gt;Rule of thumb&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Deploy across &lt;strong&gt;multiple AZs&lt;/strong&gt;, not multiple regions (unless required).&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Core AWS Building Blocks (90% of AWS)
&lt;/h2&gt;

&lt;p&gt;Most AWS architectures boil down to &lt;strong&gt;5 core pillars&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj6jk983eo06644ahlorg.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj6jk983eo06644ahlorg.jpeg" alt="Image" width="686" height="386"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3l0hllyw1b90690fda5w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3l0hllyw1b90690fda5w.png" alt="Image" width="800" height="451"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Compute - “Where Code Runs”
&lt;/h2&gt;

&lt;p&gt;This is your &lt;strong&gt;CPU + RAM&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkq9kxmmnffwbq9hhohs2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkq9kxmmnffwbq9hhohs2.png" alt="Image" width="800" height="475"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F367d9zkkal6v2pzpwxfg.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F367d9zkkal6v2pzpwxfg.webp" alt="Image" width="800" height="571"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;When to Use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EC2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Full control, legacy apps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Lambda&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No servers, event-driven&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ECS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Simple containers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EKS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Kubernetes workloads&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you understand &lt;strong&gt;EC2 + Lambda&lt;/strong&gt;, you already understand compute.&lt;/p&gt;




&lt;h2&gt;
  
  
  Storage - Where Data Lives
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8pan6wfg9hrwj90bw5re.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8pan6wfg9hrwj90bw5re.png" alt="Image" width="800" height="475"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg5jm23m19f5pw800unwz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg5jm23m19f5pw800unwz.png" alt="Image" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;What It Is&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;S3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Object storage (files, backups)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EBS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Disk for EC2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EFS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Shared filesystem&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Golden Rule&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Files / backups → &lt;strong&gt;S3&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;OS disk → &lt;strong&gt;EBS&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Networking
&lt;/h2&gt;

&lt;p&gt;This is where most people panic &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6u1zbfb2ofr9mu6ku5qe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6u1zbfb2ofr9mu6ku5qe.png" alt="Image" width="747" height="517"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd47gn4cd5s24480txspg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd47gn4cd5s24480txspg.png" alt="Image" width="800" height="471"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  VPC (Virtual Private Cloud)
&lt;/h3&gt;

&lt;p&gt;Your &lt;strong&gt;private network&lt;/strong&gt; inside AWS.&lt;/p&gt;

&lt;p&gt;Inside a VPC:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Subnets (public / private)&lt;/li&gt;
&lt;li&gt;Route tables&lt;/li&gt;
&lt;li&gt;Internet Gateway&lt;/li&gt;
&lt;li&gt;NAT Gateway&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;💡 &lt;strong&gt;Simple idea&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;VPC = Your own data center network, but virtual&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Security - “Who Can Do What”
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1umlnvilpzi5d6cn22tl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1umlnvilpzi5d6cn22tl.png" alt="Image" width="800" height="709"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frlykppp90ekl5ruzumxf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frlykppp90ekl5ruzumxf.png" alt="Image" width="800" height="523"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  IAM (Identity &amp;amp; Access Management)
&lt;/h3&gt;

&lt;p&gt;Controls:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Who&lt;/strong&gt; can access&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What&lt;/strong&gt; they can do&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Key concepts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Users (humans)&lt;/li&gt;
&lt;li&gt;Roles (services)&lt;/li&gt;
&lt;li&gt;Policies (permissions)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Never use the root user for day-to-day work&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Always use roles for EC2/Lambda&lt;/strong&gt;&lt;/p&gt;
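
&lt;p&gt;A role is tied to a service through its trust policy, which says who is allowed to assume it. Here is a small Python sketch that builds that JSON document; the helper name trust_policy_for is hypothetical, while the policy fields and the ec2.amazonaws.com and lambda.amazonaws.com service principals are the standard ones.&lt;/p&gt;

```python
import json


def trust_policy_for(service_principal):
    """Build a trust policy letting one AWS service assume a role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service_principal},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Use "lambda.amazonaws.com" instead for a Lambda execution role.
    print(json.dumps(trust_policy_for("ec2.amazonaws.com"), indent=2))
```

&lt;p&gt;The trust policy only covers who can assume the role. What the role may actually do still comes from the permission policies attached to it.&lt;/p&gt;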




&lt;h2&gt;
  
  
  Databases - “Where Structured Data Lives”
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkeb2on6lg4geomup3y8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkeb2on6lg4geomup3y8.png" alt="Image" width="800" height="334"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcjop069ol5eyryotu9zk.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcjop069ol5eyryotu9zk.jpeg" alt="Image" width="432" height="432"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;RDS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SQL (MySQL, Postgres)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Aurora&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cloud-native SQL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DynamoDB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;NoSQL, massive scale&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  How AWS Services Connect - The Flow
&lt;/h2&gt;

&lt;p&gt;Let’s connect everything with a &lt;strong&gt;real example&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqfu3v01lsi607u2ym3lg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqfu3v01lsi607u2ym3lg.png" alt="Image" width="800" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg35bukotxcgxmsgst7qr.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg35bukotxcgxmsgst7qr.jpeg" alt="Image" width="800" height="1205"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  A Simple Web App Flow
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;User hits your website&lt;/li&gt;
&lt;li&gt;Route53 resolves DNS&lt;/li&gt;
&lt;li&gt;Load Balancer receives traffic&lt;/li&gt;
&lt;li&gt;EC2 / ECS / Lambda processes request&lt;/li&gt;
&lt;li&gt;Data fetched from RDS / DynamoDB&lt;/li&gt;
&lt;li&gt;Files served from S3&lt;/li&gt;
&lt;li&gt;Logs go to CloudWatch&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Okay....&lt;br&gt;
That’s &lt;strong&gt;80% of AWS architectures&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why AWS Feels Overwhelming?
&lt;/h2&gt;

&lt;p&gt;AWS looks huge because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every feature is a &lt;strong&gt;separate service&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Names are technical, not friendly&lt;/li&gt;
&lt;li&gt;Docs assume prior cloud knowledge&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But underneath…&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;AWS = Compute + Storage + Network + Security + Database&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Everything else is &lt;strong&gt;just an add-on&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  How I Recommend Learning AWS with a Practical Path
&lt;/h2&gt;

&lt;p&gt;As a DevOps engineer, this is the order that &lt;em&gt;actually works&lt;/em&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;EC2&lt;/li&gt;
&lt;li&gt;VPC basics&lt;/li&gt;
&lt;li&gt;IAM roles&lt;/li&gt;
&lt;li&gt;S3&lt;/li&gt;
&lt;li&gt;RDS&lt;/li&gt;
&lt;li&gt;Load Balancer&lt;/li&gt;
&lt;li&gt;Auto Scaling&lt;/li&gt;
&lt;li&gt;Then containers &amp;amp; serverless&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Don’t start with EKS.&lt;br&gt;
Don’t chase certifications blindly.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;AWS is not meant to be learned &lt;strong&gt;top-down&lt;/strong&gt;.&lt;br&gt;
It’s meant to be learned &lt;strong&gt;layer-by-layer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Once you build the mental model, AWS stops being scary&lt;br&gt;
and starts feeling like &lt;strong&gt;Lego blocks for engineers&lt;/strong&gt;.&lt;/p&gt;




</description>
    </item>
    <item>
      <title>🔥Java Spring Framework &amp; Spring Boot : A simple, no-nonsense guide that actually makes sense</title>
      <dc:creator>kaustubh yerkade</dc:creator>
      <pubDate>Fri, 12 Dec 2025 03:26:11 +0000</pubDate>
      <link>https://forem.com/kaustubhyerkade/java-spring-framework-spring-boot-a-simple-no-nonsense-guide-that-actually-makes-sense-ed</link>
      <guid>https://forem.com/kaustubhyerkade/java-spring-framework-spring-boot-a-simple-no-nonsense-guide-that-actually-makes-sense-ed</guid>
      <description>&lt;p&gt;&lt;strong&gt;Spring Framework &amp;amp; Spring Boot quietly run half the world’s backend systems including banking, fintech, telecom, e-commerce, logistics, and even government platforms.&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;Five-ish years ago, when I first started working on a Spring project, I was completely confused. Everything felt overwhelming. Annotations, beans, IoC, DI, controllers, configurations… &lt;/p&gt;

&lt;p&gt;I still remember the exact moment my manager walked up to me and said:&lt;/p&gt;

&lt;p&gt;“Hey, can you build this module on Spring?”&lt;/p&gt;

&lt;p&gt;I confidently replied:&lt;br&gt;
“Yes, of course!”&lt;br&gt;
(because saying no felt illegal on day one.)&lt;/p&gt;

&lt;p&gt;But inside my head I was like:&lt;/p&gt;

&lt;p&gt;“Spring? Which Spring?” &lt;/p&gt;

&lt;p&gt;I had already worked with Servlets, JSP, Struts…&lt;br&gt;
I understood MVC…&lt;br&gt;
I had deployed Java apps…&lt;/p&gt;

&lt;p&gt;But Spring felt like an alien framework that landed straight from another planet.&lt;/p&gt;

&lt;p&gt;When I opened the project in STS, my screen was full of strange things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;@Autowired&lt;/li&gt;
&lt;li&gt;@Controller&lt;/li&gt;
&lt;li&gt;@Component&lt;/li&gt;
&lt;li&gt;beans&lt;/li&gt;
&lt;li&gt;contexts&lt;/li&gt;
&lt;li&gt;configurations&lt;/li&gt;
&lt;li&gt;application.properties&lt;/li&gt;
&lt;li&gt;XML vs annotation&lt;/li&gt;
&lt;li&gt;IoC container&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And I remember thinking:&lt;/p&gt;

&lt;p&gt;“Bro… where is main()?”&lt;br&gt;
“Who is creating objects?”&lt;br&gt;
“What is this magic? Why is nothing obvious?”&lt;/p&gt;

&lt;p&gt;I tried reading docs, blogs, Stack Overflow. ChatGPT wasn’t a thing back then...&lt;br&gt;
But every explanation used even more confusing words:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inversion of Control&lt;/li&gt;
&lt;li&gt;Dependency Injection&lt;/li&gt;
&lt;li&gt;AOP&lt;/li&gt;
&lt;li&gt;Bean lifecycle&lt;/li&gt;
&lt;li&gt;Application context&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I still remember the day I managed to create my first @RestController and return "Hello World".&lt;/p&gt;

&lt;p&gt;It felt like I had cracked a top-secret code. But those early days were full of head-scratching. &lt;/p&gt;

&lt;p&gt;I already knew the old-school Java stack: Core Java OOP, Servlets &amp;amp; JSP, Struts, JDBC, and so on. Even with that experience, Spring felt like a brand-new world, almost a different mindset.&lt;/p&gt;

&lt;p&gt;Over time, after working on real projects, breaking things, fixing things, and understanding the architecture deeply, everything finally clicked. But that learning curve was steep. Long. And honestly… unnecessary.&lt;/p&gt;

&lt;p&gt;I’m writing this blog because I don’t want today’s beginners to feel the same confusion I did. Even though I came from traditional Java technologies, Spring still felt like a big shift.&lt;/p&gt;

&lt;p&gt;If you’re new to Spring, I hope this blog becomes the explanation I wish I had 6 years ago.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Does Spring Exist?
&lt;/h2&gt;

&lt;p&gt;Before Spring arrived, Java developers were struggling.&lt;br&gt;
Even if you knew Core Java, building enterprise applications was painful: everything was tightly coupled, buried in XML, hard to test, slow to build, and dependent on heavy J2EE servers (WebSphere, WebLogic). Java was powerful, but not developer-friendly.&lt;/p&gt;

&lt;p&gt;Imagine you’re building a big Java application the traditional way.&lt;br&gt;
Suddenly you realize:&lt;/p&gt;

&lt;p&gt;You are writing too much boilerplate&lt;br&gt;
You’re manually managing objects&lt;br&gt;
You’re copying repetitive code&lt;br&gt;
You’re configuring things every 5 minutes&lt;br&gt;
You’re stressed, tired, and thinking of switching careers&lt;/p&gt;

&lt;p&gt;Spring Framework saved devs from this suffering. Spring said:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Let’s make Java development simple again.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Rod Johnson&lt;/strong&gt; wrote a book in 2002 called “Expert One-on-One J2EE Design and Development.”&lt;/p&gt;

&lt;p&gt;In that book, he proposed a new, simple approach using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;lightweight containers&lt;/li&gt;
&lt;li&gt;dependency injection&lt;/li&gt;
&lt;li&gt;POJOs instead of EJBs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This idea became the foundation of the Spring Framework, released in 2003.&lt;br&gt;
The goal was simple:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;“Make enterprise Java simpler, more productive, and more enjoyable.”&lt;/strong&gt;&lt;br&gt;
Spring completely transformed Java by introducing:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Old Java (Before Spring)&lt;/th&gt;
&lt;th&gt;With Spring&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Heavy EJB containers&lt;/td&gt;
&lt;td&gt;Lightweight IoC containers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;XML everywhere&lt;/td&gt;
&lt;td&gt;Annotations &amp;amp; auto config&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hard testing&lt;/td&gt;
&lt;td&gt;Easy unit tests&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manual object wiring&lt;/td&gt;
&lt;td&gt;Dependency Injection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Slow, expensive deployment&lt;/td&gt;
&lt;td&gt;Fast startup with embedded servers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Strict, rigid design&lt;/td&gt;
&lt;td&gt;Flexible, modular architecture&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Spring made Java modern, and later Spring Boot made it effortless.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is the Spring Framework?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Spring Framework is a set of tools and features that help you build Java applications easily.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The secret superpower of Spring is:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Dependency Injection (DI)&lt;/strong&gt;&lt;br&gt;
Spring says:&lt;br&gt;
“Bro, don’t create objects manually… Let me handle that for you.”&lt;/p&gt;

&lt;p&gt;It creates the objects, wires them together, and manages their lifecycle, so your code becomes clean, testable, and readable.&lt;/p&gt;
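
&lt;p&gt;The core idea fits in a few lines. The sketch below is deliberately plain Python rather than Spring’s Java API, and every class name in it is hypothetical. It only illustrates constructor injection: the service receives its dependency from outside instead of creating it, which is what makes swapping in a fake for tests trivial.&lt;/p&gt;

```python
class SmtpMailer:
    """Stand-in for a real dependency (hypothetical, sends nothing)."""

    def send(self, to, text):
        return f"SMTP to {to}: {text}"


class FakeMailer:
    """Test double that records messages instead of sending them."""

    def __init__(self):
        self.sent = []

    def send(self, to, text):
        self.sent.append((to, text))
        return "recorded"


class SignupService:
    # The mailer is handed in from outside (constructor injection)
    # instead of being created inside the class.
    def __init__(self, mailer):
        self.mailer = mailer

    def register(self, email):
        return self.mailer.send(email, "Welcome!")


if __name__ == "__main__":
    print(SignupService(SmtpMailer()).register("dev@example.com"))
    fake = FakeMailer()
    SignupService(fake).register("test@example.com")
    print(fake.sent)
```

&lt;p&gt;In a Spring project the framework does the “hand it in from outside” step for you. Here it is done by hand to keep the mechanism visible.&lt;/p&gt;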

&lt;p&gt;&lt;strong&gt;2. Inversion of Control (IoC)&lt;/strong&gt;&lt;br&gt;
Instead of you controlling the app, the framework controls the flow.&lt;/p&gt;

&lt;p&gt;You simply tell Spring: “When someone needs a Car, give them this CarService.”&lt;/p&gt;

&lt;p&gt;Spring says: “Say less.”&lt;/p&gt;
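
&lt;p&gt;To demystify the container a little, here is a toy version in plain Python. It is not how Spring is implemented and none of these names come from Spring; ToyContainer, CarService, and Garage are made up for illustration. It only shows the inversion: you register how to build things, and the container decides when to build them and hands them out.&lt;/p&gt;

```python
class ToyContainer:
    """A tiny registry: maps a name to a factory that builds the object."""

    def __init__(self):
        self._factories = {}
        self._instances = {}

    def register(self, name, factory):
        self._factories[name] = factory

    def get(self, name):
        # Build each object once and reuse it, like a singleton bean.
        if name not in self._instances:
            self._instances[name] = self._factories[name](self)
        return self._instances[name]


class CarService:
    def describe(self):
        return "a car from CarService"


class Garage:
    def __init__(self, car_service):
        self.car_service = car_service


if __name__ == "__main__":
    container = ToyContainer()
    container.register("car", lambda c: CarService())
    container.register("garage", lambda c: Garage(c.get("car")))
    garage = container.get("garage")
    print(garage.car_service.describe())
```

&lt;p&gt;Notice that Garage never calls a constructor for CarService itself. Asking the container for the garage is enough, and the same CarService instance is reused everywhere it is needed.&lt;/p&gt;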

&lt;p&gt;&lt;strong&gt;3. A Giant Swiss Army Knife for Java&lt;/strong&gt;&lt;br&gt;
Spring provides ready-to-use modules:&lt;br&gt;
&lt;strong&gt;Spring MVC → web apps&lt;br&gt;
Spring Data JPA → working with databases&lt;br&gt;
Spring Security → login, JWT, OAuth&lt;br&gt;
Spring AOP → cross-cutting concerns&lt;br&gt;
Spring Test → testing made easy&lt;/strong&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;AND THEN SPRING BOOT ARRIVED !!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What is Spring Boot?
&lt;/h2&gt;

&lt;p&gt;Spring Framework was great, but configuring it felt like assembling an IKEA desk.&lt;/p&gt;

&lt;p&gt;Enter Spring Boot (2014).&lt;br&gt;
It changed everything.&lt;/p&gt;

&lt;p&gt;Spring Boot =&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spring, but with nitrous included&lt;/li&gt;
&lt;li&gt;Zero config&lt;/li&gt;
&lt;li&gt;Auto-magic setup&lt;/li&gt;
&lt;li&gt;Production ready&lt;/li&gt;
&lt;li&gt;Faster than instant noodles&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Spring Boot is a framework that helps you build production ready Java apps with minimal configuration.&lt;/p&gt;

&lt;p&gt;You don’t need:&lt;/p&gt;

&lt;p&gt;❌ XML configuration&lt;br&gt;
❌ server setup&lt;br&gt;
❌ WAR files&lt;br&gt;
❌ complicated deployments&lt;/p&gt;

&lt;p&gt;Boot gives you:&lt;/p&gt;

&lt;p&gt;✔ Embedded servers (Tomcat, Jetty)&lt;br&gt;
✔ Auto-configurations&lt;br&gt;
✔ Starters for everything&lt;br&gt;
✔ Actuator for monitoring&lt;br&gt;
✔ Easy integration with databases, JMS, Kafka, Redis, etc.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Spring Framework&lt;/th&gt;
&lt;th&gt;Spring Boot&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Setup&lt;/td&gt;
&lt;td&gt;Heavy&lt;/td&gt;
&lt;td&gt;Super light&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Configuration&lt;/td&gt;
&lt;td&gt;Manual&lt;/td&gt;
&lt;td&gt;Auto-config&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Packaging&lt;/td&gt;
&lt;td&gt;WAR&lt;/td&gt;
&lt;td&gt;JAR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Embedded Server&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speed&lt;/td&gt;
&lt;td&gt;Slower&lt;/td&gt;
&lt;td&gt;FAST FAST!!&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ideal For&lt;/td&gt;
&lt;td&gt;Complex, legacy apps&lt;/td&gt;
&lt;td&gt;Microservices, cloud, DevOps&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Final Thoughts: Why You Should Learn Spring Boot
&lt;/h2&gt;

&lt;p&gt;Massive job market&lt;br&gt;
Real-world impact&lt;br&gt;
Clean code&lt;br&gt;
Cloud-first architecture&lt;br&gt;
Future-proof skills&lt;/p&gt;

&lt;p&gt;And honestly, it’s fun to build with.....&lt;/p&gt;

</description>
      <category>java</category>
      <category>springboot</category>
      <category>devops</category>
      <category>spring</category>
    </item>
    <item>
      <title>How to Read &amp; Understand Config Files.</title>
      <dc:creator>kaustubh yerkade</dc:creator>
      <pubDate>Sun, 07 Dec 2025 20:52:24 +0000</pubDate>
      <link>https://forem.com/kaustubhyerkade/how-to-read-understand-config-files-neg</link>
      <guid>https://forem.com/kaustubhyerkade/how-to-read-understand-config-files-neg</guid>
      <description>&lt;p&gt;&lt;strong&gt;Config files are everywhere.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;They decide how your app starts, how your servers run, how your test pipeline behaves, how your Kubernetes pods survive, and sometimes… why your app randomly crashes at 2 AM (because someone used a tab instead of spaces).&lt;/p&gt;

&lt;p&gt;But here’s the truth:&lt;/p&gt;

&lt;p&gt;Most of us don’t know how to properly read a config file.&lt;br&gt;
We search for the error, guess, fall back on trial and error, and hope for the best.&lt;/p&gt;

&lt;p&gt;I am writing this guide to change that.&lt;/p&gt;

&lt;p&gt;Whether it’s YAML, JSON, HCL, INI, or the OG, XML, we’ll learn a universal method to read, understand, debug, and master any configuration file like a pro.&lt;/p&gt;

&lt;p&gt;Let’s start with the basics.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why Config Files Even Exist
&lt;/h2&gt;

&lt;p&gt;Config files let us separate code from behaviour.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your code stays clean.&lt;/li&gt;
&lt;li&gt;Your logic stays flexible.&lt;/li&gt;
&lt;li&gt;Your ops team stays sane.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Examples:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;application.properties → Java Spring Boot&lt;/li&gt;
&lt;li&gt;values.yaml → Helm chart&lt;/li&gt;
&lt;li&gt;terraform.tf → Terraform&lt;/li&gt;
&lt;li&gt;config.json → Node.js app&lt;/li&gt;
&lt;li&gt;docker-compose.yml → Multi-container setup&lt;/li&gt;
&lt;li&gt;nginx.conf → NGINX web server rules&lt;/li&gt;
&lt;li&gt;httpd.conf → Apache HTTP Server rules&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Configs = Data, Not Logic
&lt;/h2&gt;

&lt;p&gt;Config files describe things. They don’t execute things.&lt;br&gt;
So when reading a config:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don’t look for logic. Look for structure.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don’t guess meaning. Understand relationships.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don’t treat it like code. Treat it like a map.&lt;/strong&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  How to Read ANY Config File
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Identify the Format&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;2. Read Top to Bottom like a Story&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;3. Identify Sections&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;4. Look for Dependencies&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;5. Find Defaults &amp;amp; Overrides&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;6. Validate Against Documentation&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;7. Always Check Env-Specific Overrides&lt;/strong&gt;&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;1. Identify the Format&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before understanding meaning, understand syntax.&lt;br&gt;
Here’s a quick cheat sheet:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Common Use&lt;/th&gt;
&lt;th&gt;What It Looks Like&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;YAML&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Kubernetes, Ansible, GitHub Actions&lt;/td&gt;
&lt;td&gt;Indentation is everything&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;JSON&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;APIs, Node.js, configs&lt;/td&gt;
&lt;td&gt;&lt;code&gt;{ "key": "value" }&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;INI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Legacy apps&lt;/td&gt;
&lt;td&gt;&lt;code&gt;key=value&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Properties&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Java apps&lt;/td&gt;
&lt;td&gt;&lt;code&gt;port=8080&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;HCL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Terraform&lt;/td&gt;
&lt;td&gt;&lt;code&gt;resource "aws_s3_bucket" "logs" {}&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;XML&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Maven, Android&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;tag&amp;gt;value&amp;lt;/tag&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;2. Read Top to Bottom like a Story&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most config files have a hierarchy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;server:
  port: 8080
  security:
    enabled: true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Read it as a story:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It’s about a server&lt;/li&gt;
&lt;li&gt;Server has a port&lt;/li&gt;
&lt;li&gt;Server has a security section&lt;/li&gt;
&lt;li&gt;Security has a property called enabled&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You're not reading code, you're reading a description.&lt;/p&gt;
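&lt;p&gt;One way to practice the "story" reading is to flatten the hierarchy into dotted paths, so each line reads as one sentence. This is only a minimal sketch: the dictionary below is a hand-written stand-in for what a YAML parser would return for the example above, and the &lt;code&gt;flatten&lt;/code&gt; helper is a made-up name, not part of any library.&lt;/p&gt;

```python
# Hand-written stand-in for the parsed YAML example above (assumption).
config = {"server": {"port": 8080, "security": {"enabled": True}}}


def flatten(node, prefix=""):
    """Return a dict that maps dotted paths to leaf values."""
    paths = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # A nested section: keep walking down the hierarchy.
            paths.update(flatten(value, path))
        else:
            # A leaf: this is an actual setting.
            paths[path] = value
    return paths


for path, value in flatten(config).items():
    print(f"{path} = {value}")
```

&lt;p&gt;Each printed line is one fact about the system, which is the same thing you do in your head when you read the file as a description.&lt;/p&gt;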

&lt;p&gt;&lt;strong&gt;3. Identify Sections&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most configs contain:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Metadata (version, name)&lt;/li&gt;
&lt;li&gt;Inputs (environment, variables)&lt;/li&gt;
&lt;li&gt;Rules (policies, limits, resources)&lt;/li&gt;
&lt;li&gt;Outputs (final results or settings)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example: Kubernetes Deployment (excerpt)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;metadata:
  name: webapp
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: app
          image: node:18
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Breakdown:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;metadata → Identity (name, labels)&lt;/li&gt;
&lt;li&gt;spec → What to deploy&lt;/li&gt;
&lt;li&gt;replicas → How many copies&lt;/li&gt;
&lt;li&gt;template/spec → What runs inside each pod&lt;/li&gt;
&lt;li&gt;containers → Which container images&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Make mental buckets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Look for Dependencies&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Some parts depend on others.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;services:
  backend:
    image: example/backend    # placeholder image name
  frontend:
    image: example/frontend   # placeholder image name
    depends_on:
      - backend
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



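&lt;p&gt;Understanding dependency flow is crucial: here, frontend is started only after backend, so read the file in dependency order, not just top to bottom.&lt;/p&gt;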
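&lt;p&gt;If the dependency chain gets long, you can let a script work out the order for you. A rough sketch, assuming you have already copied the &lt;code&gt;depends_on&lt;/code&gt; relationships into a plain dictionary by hand (the dictionary is illustrative, not read from a real Compose file). It uses &lt;code&gt;graphlib&lt;/code&gt; from the Python 3.9+ standard library.&lt;/p&gt;

```python
from graphlib import TopologicalSorter

# Hand-copied from the snippet above (assumption):
# each service maps to the services it depends on.
depends_on = {
    "backend": [],
    "frontend": ["backend"],
}

# static_order() yields every service after the services it depends on.
start_order = list(TopologicalSorter(depends_on).static_order())
print(start_order)
```

&lt;p&gt;A circular dependency makes &lt;code&gt;static_order()&lt;/code&gt; raise &lt;code&gt;graphlib.CycleError&lt;/code&gt;, which is a useful check in its own right.&lt;/p&gt;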

&lt;p&gt;&lt;strong&gt;5. Find Defaults &amp;amp; Overrides&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Configs often override inherited defaults.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resources:
  limits:
    cpu: 500m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If something behaves unexpectedly, it's usually:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;inherited from elsewhere&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;overridden somewhere&lt;/strong&gt;&lt;/p&gt;
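&lt;p&gt;To see how a default and an override combine, it helps to merge them yourself. The sketch below is an assumption-heavy illustration: the default values and the &lt;code&gt;deep_merge&lt;/code&gt; helper are made up for this example, and real tools (Helm, Spring Boot, Terraform) each have their own merge rules that you should check in their docs.&lt;/p&gt;

```python
def deep_merge(base, override):
    """Return a new dict in which values from override win over base."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Leaf value or type mismatch: the override replaces the default.
            merged[key] = value
    return merged


# Hypothetical defaults inherited from somewhere else.
defaults = {"resources": {"limits": {"cpu": "250m", "memory": "256Mi"}}}
# The override from the snippet above.
override = {"resources": {"limits": {"cpu": "500m"}}}

effective = deep_merge(defaults, override)
print(effective)
```

&lt;p&gt;Note that the memory limit survives even though the override never mentions it. That is exactly the kind of inherited value that surprises people.&lt;/p&gt;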

&lt;p&gt;&lt;strong&gt;6. Validate Against Documentation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Never assume meaning. Every config has a reference:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Kubernetes: kubernetes.io/docs&lt;/li&gt;
&lt;li&gt;Terraform: registry.terraform.io&lt;/li&gt;
&lt;li&gt;Spring Boot: the Common Application Properties reference&lt;/li&gt;
&lt;li&gt;AWS: docs.aws.amazon.com&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One misinterpreted parameter = hours of debugging.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Always Check Env-Specific Overrides&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prod overrides dev.&lt;br&gt;
Ops overrides engineering.&lt;br&gt;
Secrets override defaults.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Example (application-prod.properties):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;db.password=${SECRET_DB_PASSWORD}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Configs are layered. Always check:&lt;/p&gt;

&lt;p&gt;.env&lt;br&gt;
values-prod.yaml&lt;br&gt;
overrides.tfvars&lt;br&gt;
secret.yaml&lt;/p&gt;
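&lt;p&gt;A placeholder like &lt;code&gt;${SECRET_DB_PASSWORD}&lt;/code&gt; means the real value lives in another layer. The sketch below only mimics that lookup with Python’s &lt;code&gt;string.Template&lt;/code&gt;; it is not how Spring Boot resolves placeholders internally, and the environment dictionary and its value are invented for the example.&lt;/p&gt;

```python
from string import Template

# Invented environment; in a real deployment this would be os.environ.
env = {"SECRET_DB_PASSWORD": "s3cr3t-example"}

line = "db.password=${SECRET_DB_PASSWORD}"
key, _, raw_value = line.partition("=")

# substitute() raises KeyError when the variable is missing,
# which is the failure you want to catch before deploying.
resolved = Template(raw_value).substitute(env)

# Never print the secret itself; report only that it was found.
print(f"{key} resolved from environment ({len(resolved)} characters)")
```

&lt;p&gt;If the lookup fails, the problem is not in the file you are reading but in the layer that was supposed to supply the value.&lt;/p&gt;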


&lt;h2&gt;
  
  
  Common Mistakes While Reading Config Files
&lt;/h2&gt;

&lt;p&gt;❌ Using tabs in YAML&lt;br&gt;
YAML hates tabs.&lt;/p&gt;

&lt;p&gt;❌ Misunderstanding indentation&lt;br&gt;
Indentation = hierarchy.&lt;br&gt;
Hierarchy = meaning.&lt;/p&gt;

&lt;p&gt;❌ Assuming a property exists&lt;br&gt;
e.g., restart: always is a Docker Compose setting; in Kubernetes the equivalent is restartPolicy: Always in the pod spec.&lt;/p&gt;

&lt;p&gt;❌ Copy-pasting from Stack Overflow&lt;br&gt;
Different environments ≠ same config.&lt;/p&gt;

&lt;p&gt;❌ Missing commas in JSON&lt;br&gt;
JSON has no tolerance for a missing or trailing comma.&lt;/p&gt;
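&lt;p&gt;Two of these mistakes are easy to catch with a few lines of code before they reach a server. A minimal sketch using only the Python standard library; the helper names and the sample strings are made up for illustration, and a dedicated linter will catch far more.&lt;/p&gt;

```python
import json


def find_tab_indented_lines(text):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad.append(number)
    return bad


def json_error(text):
    """Return None for valid JSON, otherwise a short error description."""
    try:
        json.loads(text)
        return None
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"


# Sample inputs (invented): a tab-indented YAML line and JSON missing a comma.
yaml_text = "server:\n\tport: 8080\n"
broken_json = '{"name": "web" "port": 8080}'

print(find_tab_indented_lines(yaml_text))
print(json_error(broken_json))
```

&lt;p&gt;The JSON error message includes a line and column, so you can jump straight to the spot instead of scanning the whole file.&lt;/p&gt;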



&lt;p&gt;Let’s Read a Complex Terraform Block&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "aws_instance" "web" {
  ami           = var.web_ami
  instance_type = "t3.medium"

  tags = {
    Name = "web-server"
  }

  lifecycle {
    create_before_destroy = true
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s break it down:&lt;/p&gt;

&lt;p&gt;resource "aws_instance" "web" → This block creates an EC2 instance&lt;br&gt;
ami → Which OS image&lt;br&gt;
instance_type → How big&lt;br&gt;
tags → Metadata&lt;br&gt;
lifecycle → Special behaviour&lt;br&gt;
create_before_destroy = true → Create the replacement before destroying the old instance, which helps avoid downtime&lt;/p&gt;

&lt;p&gt;You’re reading infrastructure as a story.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Keep a mental model:&lt;/strong&gt;&lt;br&gt;
Config = Hierarchical Data Structure&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Always validate (YAML lint, JSON lint)&lt;/strong&gt;&lt;br&gt;
Saves hours.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Document your configs&lt;/strong&gt;&lt;br&gt;
Future-you will thank you.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don’t fear big configs&lt;/strong&gt;&lt;br&gt;
Break them into sections → read → interpret.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Railway.app - DevOps Friendly Deployment Tool</title>
      <dc:creator>kaustubh yerkade</dc:creator>
      <pubDate>Sat, 06 Dec 2025 15:18:06 +0000</pubDate>
      <link>https://forem.com/kaustubhyerkade/railwayapp-devops-friendly-deployment-tool-5aab</link>
      <guid>https://forem.com/kaustubhyerkade/railwayapp-devops-friendly-deployment-tool-5aab</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;What if deploying an app was as easy as pushing to GitHub?&lt;br&gt;
Railway heard this… and said: Challenge accepted.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Because nearly every developer &amp;amp; DevOps engineer (especially in Java, Node, Python, Go) hits the same wall:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Where do I deploy this small project quickly?”&lt;br&gt;
“I don’t want to set up EC2, VPC, ALB, IAM, etc.”&lt;br&gt;
“Why is deployment harder than writing the actual code?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And suddenly… Railway.app entered the scene.&lt;/p&gt;

&lt;p&gt;It’s like Heroku 2.0: cleaner, faster, cheaper, and more dev-friendly.&lt;/p&gt;

&lt;p&gt;This blog will explain:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;What Railway is&lt;/li&gt;
&lt;li&gt;Why it is exploding in popularity&lt;/li&gt;
&lt;li&gt;Whether it is free or paid&lt;/li&gt;
&lt;li&gt;How it compares to AWS/Render/Vercel&lt;/li&gt;
&lt;li&gt;Who should use it&lt;/li&gt;
&lt;li&gt;A full deployment workflow&lt;/li&gt;
&lt;li&gt;And a final verdict (from a DevOps perspective)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let’s go.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;What Exactly Is Railway?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Railway is a cloud deployment platform (PaaS) where you can deploy apps by simply connecting your GitHub repo.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;No servers.&lt;br&gt;
No EC2 instances.&lt;br&gt;
No manual Dockerfiles unless you want to.&lt;br&gt;
No “it works on my machine.”&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Railway handles:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Build → Deploy → Run → Logs → Metrics&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Databases → Secrets → Networking → SSL&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Auto deployments on GitHub push&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It’s like AWS, but:&lt;br&gt;
without 20 dropdowns&lt;br&gt;
without 100 services to configure&lt;br&gt;
without IAM nightmares&lt;/p&gt;

&lt;p&gt;Perfect for busy developers.&lt;/p&gt;

&lt;p&gt;Basically, Railway is a PaaS abstraction layer, similar to Heroku, Render, and Vercel.&lt;/p&gt;

&lt;p&gt;Their goal is to hide the cloud provider and offer developers simple deployments, a unified dashboard, automatic builds, and no cloud complexity.&lt;/p&gt;

&lt;p&gt;You care about your app, not the infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How It Works&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When you connect a GitHub repo:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Railway clones it&lt;/li&gt;
&lt;li&gt;Detects the language (Java, Node, Python, Go, etc.)&lt;/li&gt;
&lt;li&gt;Generates a container build for you if you don’t provide a Dockerfile&lt;/li&gt;
&lt;li&gt;Builds the container image&lt;/li&gt;
&lt;li&gt;Pushes it to Railway’s private registry&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Railway = CI + Docker builder + deployment engine.&lt;/p&gt;
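&lt;p&gt;To make the "detects language" step less magical, here is a toy version of the idea: look for a well-known marker file in the repo root. This is purely an illustration. The marker table and the &lt;code&gt;detect_language&lt;/code&gt; function are assumptions of mine, and Railway’s actual detection rules are not described in this post.&lt;/p&gt;

```python
# Hypothetical marker files; NOT Railway's real detection rules.
MARKERS = {
    "pom.xml": "Java",
    "build.gradle": "Java",
    "package.json": "Node.js",
    "requirements.txt": "Python",
    "go.mod": "Go",
}


def detect_language(filenames):
    """Return the language of the first marker file found, or None."""
    for marker, language in MARKERS.items():
        if marker in filenames:
            return language
    return None


# Invented listing of a repo root.
repo_root = {"README.md", "package.json", "src"}
print(detect_language(repo_root))
```

&lt;p&gt;The takeaway for you as a user: keep those standard files in the repo root, because build platforms in general lean on conventions like these.&lt;/p&gt;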

&lt;p&gt;Railway schedules your app on managed compute.&lt;br&gt;
Once your container is built, Railway deploys it onto its internal compute layer. This layer does not run on VMs that Railway created itself, but on:&lt;/p&gt;

&lt;p&gt;✔ GCP VMs&lt;br&gt;
✔ Kubernetes-like orchestration&lt;br&gt;
✔ Multi-tenant container scheduling&lt;br&gt;
✔ Cloudflare at the edge&lt;/p&gt;

&lt;p&gt;Railway focuses on simplicity, hiding everything behind:&lt;br&gt;
GitHub deployment&lt;br&gt;
One-click databases&lt;br&gt;
Auto environment variables&lt;br&gt;
Auto SSL&lt;br&gt;
Live logs&lt;br&gt;
Clean UI&lt;/p&gt;

&lt;p&gt;Isolation is handled via:&lt;/p&gt;

&lt;p&gt;Linux namespaces&lt;br&gt;
cgroups&lt;br&gt;
Container sandboxing&lt;br&gt;
Kubernetes pod separation&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Why Railway Is Becoming the Developer Favourite&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Ridiculously Simple Deployment&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Push → Build → Deploy → URL ready.&lt;br&gt;
No YAML. No infra. No pain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Automatic Dockerization&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Even if you don’t write a Dockerfile, Railway builds a container image for you.&lt;/p&gt;

&lt;p&gt;Amazing for Node, Python, Java, Go, Rust.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Built-in Databases&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Click → Create PostgreSQL / MySQL / Redis&lt;br&gt;
And Railway injects their connection strings automatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Logs, Metrics &amp;amp; Live Terminals&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No more SSHing into servers.&lt;br&gt;
Everything is on one beautiful dashboard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. CI/CD without configuring CI/CD&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every GitHub push becomes a deployment.&lt;br&gt;
Zero setup.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Is Railway Free or Paid? (Breakdown)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Railway has two sides:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;FREE (Trial/Starter Credits)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;New users get $5 in free credits.&lt;br&gt;
Some users get about $1 in free credits per month.&lt;/p&gt;

&lt;p&gt;Good for:&lt;/p&gt;

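&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Prototypes&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hackathon projects&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Testing APIs&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Learning DevOps workflows&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;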

&lt;p&gt;But not enough for production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;PAID (Usage-Based Pricing)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Starts at $5/month (Hobby Plan)&lt;/p&gt;

&lt;p&gt;You pay for:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CPU&lt;br&gt;
RAM&lt;br&gt;
Storage&lt;br&gt;
Bandwidth&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Still way cheaper and simpler than AWS for small projects.&lt;/p&gt;

&lt;p&gt;✔ Perfect for side projects&lt;br&gt;
✔ Perfect for indie devs&lt;br&gt;
✔ Perfect for POCs&lt;br&gt;
✘ Not ideal for big enterprise workloads&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3e048hqkz40ihok1zkl5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3e048hqkz40ihok1zkl5.png" alt=" " width="800" height="148"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Railway vs AWS vs Render vs Heroku
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Railway&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;AWS&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Heroku&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Render&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Ease of Use&lt;/td&gt;
&lt;td&gt;⭐⭐⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐&lt;/td&gt;
&lt;td&gt;⭐⭐⭐⭐&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Free Tier&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment Speed&lt;/td&gt;
&lt;td&gt;Fastest&lt;/td&gt;
&lt;td&gt;Slow&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DB Setup&lt;/td&gt;
&lt;td&gt;1-click&lt;/td&gt;
&lt;td&gt;Complex&lt;/td&gt;
&lt;td&gt;1-click&lt;/td&gt;
&lt;td&gt;1-click&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Logs &amp;amp; Metrics&lt;/td&gt;
&lt;td&gt;Built-in&lt;/td&gt;
&lt;td&gt;Requires setup&lt;/td&gt;
&lt;td&gt;Built-in&lt;/td&gt;
&lt;td&gt;Built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best For&lt;/td&gt;
&lt;td&gt;Devs, startups&lt;/td&gt;
&lt;td&gt;Enterprises&lt;/td&gt;
&lt;td&gt;Students, hobbyists&lt;/td&gt;
&lt;td&gt;Small production apps&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  How to Deploy an App on Railway (Step-by-Step)
&lt;/h2&gt;

&lt;p&gt;Here’s how deploying a Java Spring Boot, Node.js, or Python Flask app looks:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Push Your Code to GitHub&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Railway works best when connected to GitHub.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Click “New Project → Deploy from GitHub Repo”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Railway auto-detects:&lt;/p&gt;

&lt;p&gt;Language&lt;br&gt;
Framework&lt;br&gt;
Build tool&lt;br&gt;
Start command&lt;/p&gt;

&lt;p&gt;No config needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Add Environment Variables&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Everything is managed through a simple UI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Add a Database (Optional)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Click → PostgreSQL → Done.&lt;br&gt;
Railway injects DATABASE_URL automatically.&lt;/p&gt;
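&lt;p&gt;On the app side, all you have to do is read that variable. A small sketch of what that can look like in Python, using only the standard library. The fallback URL is a made-up placeholder so the snippet also runs on your laptop, and &lt;code&gt;parse_database_url&lt;/code&gt; is a name I picked for the example; most database drivers and ORMs accept the URL string directly, so splitting it is only needed when you want the individual parts.&lt;/p&gt;

```python
import os
from urllib.parse import urlsplit


def parse_database_url(url):
    """Split a database URL into the pieces a client usually needs."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname,
        "port": parts.port,
        "user": parts.username,
        "database": parts.path.lstrip("/"),
    }


# Placeholder fallback (assumption) for running outside the platform.
url = os.environ.get("DATABASE_URL", "postgresql://user:pass@localhost:5432/appdb")
settings = parse_database_url(url)

# Log where we connect, but never the password.
print("Connecting to", settings["host"], "database", settings["database"])
```

&lt;p&gt;Reading the value from the environment instead of hardcoding it keeps the same code working in dev, staging, and prod.&lt;/p&gt;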

&lt;p&gt;&lt;strong&gt;5. Watch the Build Logs&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It builds your code in the cloud.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Get Your Live Production URL&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Your app is deployed. In minutes. No infra.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Why DevOps Engineers Should Care&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Railway is becoming a go-to platform for prototyping and lightweight hosting. It is best for:&lt;/p&gt;

&lt;p&gt;✔ Microservices&lt;br&gt;
✔ Event-driven jobs&lt;br&gt;
✔ CI/CD experiments&lt;br&gt;
✔ API services&lt;br&gt;
✔ ML inference micro-apps&lt;br&gt;
✔ MVP deployments&lt;br&gt;
✔ Bootstrapping startup ideas&lt;/p&gt;

&lt;p&gt;Railway will save you hours and help you ship faster.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://railway.com/" rel="noopener noreferrer"&gt;https://railway.com/&lt;/a&gt;&lt;br&gt;
&lt;a href="https://docs.railway.com/" rel="noopener noreferrer"&gt;https://docs.railway.com/&lt;/a&gt;&lt;br&gt;
&lt;a href="https://blog.railway.com/" rel="noopener noreferrer"&gt;https://blog.railway.com/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Bounty Program - &lt;a href="https://station.railway.com/" rel="noopener noreferrer"&gt;https://station.railway.com/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let me know your thoughts.&lt;/p&gt;

</description>
      <category>railwayapp</category>
      <category>devops</category>
      <category>cicd</category>
      <category>paas</category>
    </item>
  </channel>
</rss>
