Forem

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

OpenTelemetry vs Logstash - Which Logging Tool Is Right for You?

1
Comments
9 min read
Your Kubernetes Cluster Shouldn't Need You at 3am

Comments
1 min read
Your NOC Team Will Be 80% Smaller in 3 Years. Here's Why That's Not a Bad Thing.

Comments
2 min read
SLIs, SLOs, SLAs: The Guide to SRE’s Secret Sauce

Comments
3 min read
What “Read-Only Fridays” Quietly Reveal About Your Platform

Comments 1
1 min read
Setup NUT on Proxmox

Comments
3 min read
Build an AI Code Review Agent in GitHub Actions (That Actually Reduces Incidents)

2
Comments
4 min read
API Uptime SLA: What 99.9% Really Means for Your Application

Comments
6 min read
Your Traces Look Fine. Your Revenue Isn’t.

1
Comments
2 min read
Chaos Engineering Lite: Testing your AWS Alarms with Intentional Failures

Comments
1 min read
What Really Breaks in Large-Scale Cloud Migrations: The Solution!

Comments
4 min read
LGTM != Production Ready: Why your CI pipeline is missing the most important step

Comments
3 min read
Why AI SRE tools don't work (and what we're doing differently)

4
Comments 2
4 min read
Linux Privileges: Peeling Back the Curtain on How Linux Really Handles Users, Privileges, and Processes

4
Comments
5 min read
Docker Monitoring Without a Platform: docker stats + cgroups (DevOps)

Comments
3 min read