Forem

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

What a 60-second war-room scan reveals

Comments
3 min read
A Measurable Snapchat Proxy Validation Mini Lab You Can Run This Week

Comments
6 min read
The "DevOps Engineer" is Dead. Long Live the Platform Architect.

5
Comments
2 min read
DevOps vs SRE vs Platform Engineering: What’s the Difference?

1
Comments
2 min read
Datadog vs OneUptime vs OptyxStack – Understanding the Differences in Observability and Operations

5
Comments
2 min read
DevOps with AI: Who Is in Control of the Pipeline?

Comments
13 min read
Spegel, Pixie, and Why :latest Is Evil

Comments
4 min read
Rotating Residential Proxy Evaluation Mini-Lab You Can Run in 90 Minutes

Comments
6 min read
From cronjobs to controllers: Building a production-grade Kubernetes Backup & Restore Operator

1
Comments
4 min read
Workflow Deep Dive

Comments
1 min read
Building a Config Drift Detector for AWS (with Snapshots, Lambdas, and a Next.js Dashboard)

Comments
5 min read
Running Cluster on 100% Spot Instances: How K8s Does It Better Than ECS

Comments
4 min read
Two Terraform Traps That Burned Me: Hidden Defaults & Circular Dependencies

Comments
4 min read
Why Your Engineering Wiki is a Graveyard (And How to Fix It)

Comments
3 min read
How to Make Engineering Knowledge Searchable (A Complete Guide)

1
Comments
3 min read
Shift-Left Reliability

Comments
4 min read
How to pass the CKA Exam on the first try [GUARANTEED]

Comments 1
4 min read
You’re Running EC2 Instances That Do Nothing

1
Comments
2 min read
10 Proven Ways to Cut Your AWS Bill

1
Comments
3 min read
AWS DevOps Agent

Comments
4 min read
Why Most DevOps Tutorials Fail in Production Environments

Comments
2 min read
Kubernetes Persistence Series Part 3: Controllers & Resilience — Why Kubernetes Self-Heals

8
Comments
4 min read
Kubernetes Persistence Series Part 1: When Our Ingress Vanished After a Node Upgrade

9
Comments
4 min read
Building a Multi-Account CloudWatch Dashboard That Actually Works

5
Comments
2 min read
Virtual Private Cloud Explained Simply

Comments
3 min read