<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Olivix</title>
    <description>The latest articles on Forem by Olivix (@olivix).</description>
    <link>https://forem.com/olivix</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3849481%2F57be9e4f-bda3-4cbb-9b46-273ca81401e6.png</url>
      <title>Forem: Olivix</title>
      <link>https://forem.com/olivix</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/olivix"/>
    <language>en</language>
    <item>
      <title>Why On-Call Burnout Is an Onboarding Problem (and You Probably Don't See It)</title>
      <dc:creator>Olivix</dc:creator>
      <pubDate>Fri, 10 Apr 2026 00:42:58 +0000</pubDate>
      <link>https://forem.com/olivix/why-on-call-burnout-is-an-onboarding-problem-and-you-probably-dont-see-it-2cn6</link>
      <guid>https://forem.com/olivix/why-on-call-burnout-is-an-onboarding-problem-and-you-probably-dont-see-it-2cn6</guid>
      <description>&lt;p&gt;Here's something nobody talks about: your on-call burnout isn't about being on-call.&lt;/p&gt;

&lt;p&gt;It's about what happens &lt;em&gt;after&lt;/em&gt; you fix something.&lt;/p&gt;

&lt;p&gt;Two weeks ago, a database query locked a critical table for 15 minutes. Cost the company $50K in lost revenue. Took me 30 minutes to fix (restart the query), but 3 hours to figure out &lt;em&gt;why&lt;/em&gt; it happened.&lt;/p&gt;

&lt;p&gt;The follow-up? We added monitoring. Great.&lt;/p&gt;

&lt;p&gt;But did we actually figure out why someone wrote a query that could lock that table? No. Did we trace it back to the feature that introduced it? No. Did we understand the sequence of deploys that left the table vulnerable? No.&lt;/p&gt;

&lt;p&gt;So when something similar happens in 6 months (and it will), someone else will spend 3 hours debugging it again.&lt;/p&gt;

&lt;p&gt;This is the pattern that burns people out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;On-call isn't the problem.&lt;/strong&gt; Incident after incident is the problem. And we keep having the same incidents because we're solving them one layer too shallow.&lt;/p&gt;

&lt;p&gt;Junior engineers onboard, and within months they're exhausted because they're learning by firefighting. Senior engineers leave because they're tired of playing whack-a-mole. New hires quit during their first on-call rotation because the incidents feel random and unsolvable.&lt;/p&gt;

&lt;p&gt;The real fix isn't a better rotation schedule. It's understanding your incidents deeply enough to prevent the whole &lt;em&gt;class&lt;/em&gt; of incident, not just patch the symptom.&lt;/p&gt;

&lt;p&gt;That's the only sustainable way to run on-call.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tell me:&lt;/strong&gt; What's the biggest gap in your incident analysis process? Are you finding the root cause, or just fixing the immediate break?&lt;/p&gt;

&lt;p&gt;Built by: #olivix &lt;a href="https://olivix.app/" rel="noopener noreferrer"&gt;https://olivix.app/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>sre</category>
      <category>devops</category>
      <category>programming</category>
    </item>
    <item>
      <title>The Midnight Incident: When Being On-Call Means Losing Sleep</title>
      <dc:creator>Olivix</dc:creator>
      <pubDate>Thu, 09 Apr 2026 00:48:36 +0000</pubDate>
      <link>https://forem.com/olivix/the-midnight-incident-when-being-on-call-means-losing-sleep-hme</link>
      <guid>https://forem.com/olivix/the-midnight-incident-when-being-on-call-means-losing-sleep-hme</guid>
      <description>&lt;p&gt;It's 3:17 AM on a Wednesday.&lt;/p&gt;

&lt;p&gt;My phone buzzes. Then vibrates. Then buzzes again. The on-call alert I've been dreading since 5pm yesterday has finally come through.&lt;/p&gt;

&lt;p&gt;I stumble out of bed, half-awake, and start the familiar dance: Slack, Grafana, CloudWatch, logs. Pieces scattered everywhere. No single view of what's actually happening.&lt;/p&gt;

&lt;p&gt;The message comes through: "Site is down. Revenue is bleeding. Fix it now."&lt;/p&gt;

&lt;p&gt;So I do what I've done a hundred times before. I start connecting dots. A spike in API latency here. Memory usage there. A failed deployment from earlier today. Maybe that?&lt;/p&gt;

&lt;p&gt;It takes me 45 minutes to piece it together. The root cause is a database migration that ran too long, locked a critical table, and brought everything down. But I don't realize that until I've already checked 15 different things.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The cost of being on-call isn't just downtime—it's the exhaustion.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I fix the issue by 4:30 AM. That should mean I can go back to sleep, right? Nope. My brain is running on adrenaline and anxiety, replaying every second and wondering if I missed something.&lt;/p&gt;

&lt;p&gt;I don't fall back asleep. I grab coffee at 5am and show up to work pretending everything's fine.&lt;/p&gt;

&lt;p&gt;By 2pm, I'm exhausted. By 6pm, I'm making stupid mistakes in code review because my brain is fried. Tomorrow I'll do the postmortem, except I'm so tired I'll probably miss the actual root cause and just apply a band-aid.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is the hidden cost of on-call culture.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;And here's what frustrates me most: we have all the data to understand what actually happened. Metrics. Logs. Traces. Deployment records. It's all there. But it's scattered across 5 different tools, and piecing it together requires hours of detective work—especially at 3am when you should be sleeping.&lt;/p&gt;

&lt;p&gt;What if there were a way to see the full incident context immediately? What if instead of being a detective, you could just... know what went wrong?&lt;/p&gt;

&lt;p&gt;That's the vision we're working toward. Because on-call engineers deserve better than midnight guessing games.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real talk:&lt;/strong&gt; Have you been there? That moment at 3am when you're just throwing spaghetti at the wall, hoping something sticks? What was the incident that made you want to pull your hair out?&lt;/p&gt;

&lt;p&gt;Drop your story in the comments — I think we'd all feel less alone knowing we've survived similar chaos.&lt;/p&gt;

&lt;p&gt;Built by: #olivix &lt;a href="https://olivix.app/" rel="noopener noreferrer"&gt;https://olivix.app/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>sre</category>
      <category>devops</category>
      <category>ai</category>
      <category>aiops</category>
    </item>
  </channel>
</rss>
