Forem

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

Mastering LVM: From Basics to Advanced Migration, Backup & Recovery

1
Comments
6 min read
Microservices and the Myth of Fault Isolation

Comments
3 min read
The Hidden Cost of AI in SRE: Why Automation Hasn’t Fixed Burnout

1
Comments
2 min read
The Merge Queue Scaling Problem Every Growing Team Hits

Comments
1 min read
Breaking Things on Purpose: What I Learned from Netflix’s Chaos Monkey

8
Comments 4
2 min read
🔐Raise your hand if you use SSH every day without actually knowing what it does. Yeah, me too😁 you’re definitely not alone.

9
Comments 3
4 min read
OOMKilled Pods: A guide to troubleshooting.

Comments
5 min read
logbloglogbloglogblog

Comments
4 min read
Why You're Spending Too Much Money on Datadog Metrics

1
Comments
2 min read
Gonzo - The Go based TUI for log analysis

Comments
1 min read
Why Self-Hosting made me a better engineer

1
Comments
4 min read
Linux Fundamentals for DevOps & SRE: The Only Guide You'll Ever Need

10
Comments
15 min read
Kubernetes Storage: Trading a Ferrari for a Reliable Minivan.

1
Comments 2
3 min read
Take Control of your Logs: Top 10 ways using the OpenTelemetry Collector

Comments
2 min read
Importance of Graceful Shutdown in Kubernetes

3
Comments
7 min read