Forem

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

WTF is Site Reliability Engineering?

1
Comments
3 min read
ComunicaOps: Laying the Foundations for Building Platforms

3
Comments
2 min read
Blue/Green and Canary on Kubernetes with Argo Rollouts [Lab Session]

15
Comments
11 min read
Why Platform Engineering? A Tale from a Busy Kitchen

Comments
1 min read
Unboxing Terraform Internals – Part 1: The Big Picture

Comments
5 min read
Orchestrating end-to-end service deployment using TypeScript workflows

4
Comments
2 min read
Build C Projects Like a Pro: A Guide to Idiomatic Makefiles

1
Comments 2
7 min read
I Built an AI-Powered CLI to Help Debug Production Incidents | Meet Incident Helper

1
Comments
3 min read
Amazon API Gateway Observability Best Practices with Datadog

1
Comments
4 min read
HashiCorp Nomad vs. Kubernetes: Understanding the Workload Orchestrator with Practical Examples

Comments
1 min read
Chaos Engineering in Production: Building Resilient Systems with Chaos Mesh

Comments
1 min read
When APIs Fail: A Developer's Journey with Retries, Back Off, and Jitter

2
Comments
11 min read
OpenTofu CI/CD Guide: How to Automate Infrastructure Changes with Confidence

1
Comments
3 min read
Cost-Tracking and Model-Spend Monitoring with LiteLLM

1
Comments
2 min read
Unleashing Resilience: 15+ Essential Chaos Engineering Tools for Robust Systems

Comments
6 min read
AI-Powered Kubernetes Debugging with Python and Ollama

Comments
6 min read
Understanding `kube-system` in Kubernetes: A City Analogy You’ll Never Forget

5
Comments
2 min read
Top 15 Must-Have CI/CD Tools for DevOps & SRE Success

Comments
6 min read
Why Was My Localhost SSH Taking 3 Seconds? A Deep Dive.

Comments
4 min read
🚀 The Ultimate DevOps Emoji Glossary

1
Comments
2 min read
10 Essential Tips for Setting Up Monitoring for Your SaaS

Comments
5 min read
Kubernetes Node Management - Drain, Cordon and Uncordon

6
Comments
2 min read
Mastering `map()` and `tolist()` in Terraform 🧰

Comments
2 min read
Why Use a Status Page Aggregator?

Comments
5 min read
How to Write Effective Incident Post-Mortems: A Complete Guide

6
Comments
6 min read