<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Zeel Patel</title>
    <description>The latest articles on Forem by Zeel Patel (@zeel_patel).</description>
    <link>https://forem.com/zeel_patel</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1835942%2Fea032fdc-98a8-4f51-a089-5f994a75052c.jpg</url>
      <title>Forem: Zeel Patel</title>
      <link>https://forem.com/zeel_patel</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/zeel_patel"/>
    <language>en</language>
    <item>
      <title>I Built an Open-Source CLI That Diagnoses Production Incidents in 30 Seconds — Looking for Contributors</title>
      <dc:creator>Zeel Patel</dc:creator>
      <pubDate>Mon, 09 Mar 2026 20:58:15 +0000</pubDate>
      <link>https://forem.com/zeel_patel/i-built-an-open-source-cli-that-diagnoses-production-incidents-in-30-seconds-looking-for-4p63</link>
      <guid>https://forem.com/zeel_patel/i-built-an-open-source-cli-that-diagnoses-production-incidents-in-30-seconds-looking-for-4p63</guid>
      <description>&lt;p&gt;Every engineer who's been on-call knows the drill.&lt;/p&gt;

&lt;p&gt;It's 3 AM. PagerDuty goes off. You open your laptop, squint at CloudWatch, start grepping through thousands of log lines, flip over to GitHub to check if anyone deployed recently, then paste everything into an AI chat hoping it can make sense of the mess.&lt;/p&gt;

&lt;p&gt;45 minutes later, you find it. Someone changed the Redis connection pool from 50 to 5.&lt;/p&gt;

&lt;p&gt;I got tired of doing this manually, so I built &lt;strong&gt;AUTOPSY&lt;/strong&gt;, an open-source Python CLI that runs the entire investigation in one command:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;autopsy-cli
autopsy diagnose
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It pulls your last 30 minutes of error logs from AWS CloudWatch, fetches recent commits and diffs from GitHub, sends everything to an LLM (Claude or GPT-4o), and prints a structured root cause analysis directly in your terminal.&lt;/p&gt;

&lt;p&gt;The whole thing runs locally. No agents, no platform, no servers. Your logs go from AWS to your machine and on to the AI provider, using your own credentials. Nothing touches our infrastructure.&lt;/p&gt;




&lt;h2&gt;How It Works Under the Hood&lt;/h2&gt;

&lt;p&gt;The architecture is a modular pipeline:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CLI (Click)
  └── DiagnosisOrchestrator
        ├── CloudWatchCollector  →  AWS Logs Insights (boto3)
        ├── GitHubCollector      →  Commits + diffs (PyGithub)
        └── AIEngine             →  Anthropic / OpenAI
                └── TerminalRenderer (Rich)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A four-stage log reduction pipeline compresses raw CloudWatch output to fit LLM context windows — regex filtering, SHA256 deduplication, truncation, and a hard 6,000-token budget. The AI response is validated against a Pydantic schema, with automatic retry on malformed output.&lt;/p&gt;
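&lt;p&gt;To make the four stages concrete, here is a minimal sketch of how such a reduction could look. It is not AUTOPSY's actual code: the error pattern, the 500-character line cap, the four-characters-per-token estimate, and the names &lt;code&gt;reduce_logs&lt;/code&gt; and &lt;code&gt;estimate_tokens&lt;/code&gt; are all assumptions for illustration. Only the stage order and the 6,000-token budget come from the description above.&lt;/p&gt;

```python
import hashlib
import re

# Assumed values: the post names the stages but not these specifics.
ERROR_PATTERN = re.compile(r"ERROR|CRITICAL|Exception|Traceback", re.IGNORECASE)
MAX_LINE_CHARS = 500
TOKEN_BUDGET = 6000
CHARS_PER_TOKEN = 4  # rough heuristic, not a real tokenizer


def estimate_tokens(text):
    """Approximate the token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def reduce_logs(lines, token_budget=TOKEN_BUDGET):
    """Filter, deduplicate, truncate, and cap log lines to a token budget."""
    # Stage 1: regex filtering keeps only lines that look like errors.
    matched = [line for line in lines if ERROR_PATTERN.search(line)]

    # Stage 2: SHA256 deduplication drops exact repeats, keeping the first.
    seen = set()
    unique = []
    for line in matched:
        digest = hashlib.sha256(line.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(line)

    # Stage 3: truncation caps the length of each remaining line.
    truncated = [line[:MAX_LINE_CHARS] for line in unique]

    # Stage 4: hard budget stops at the first line that would not fit.
    kept = []
    remaining = token_budget
    for line in truncated:
        cost = estimate_tokens(line)
        if cost > remaining:
            break
        kept.append(line)
        remaining -= cost
    return kept
```

&lt;p&gt;Exact-match hashing only collapses byte-identical lines, so a real pipeline would likely strip timestamps or request IDs before hashing. That normalization step is left out of this sketch.&lt;/p&gt;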

&lt;p&gt;Every collector implements a &lt;code&gt;BaseCollector&lt;/code&gt; interface, so adding new data sources (Datadog, ELK, GCP) is a single new class.&lt;/p&gt;
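&lt;p&gt;As a rough illustration of what "a single new class" might mean, the sketch below defines an abstract base class and one toy subclass. Only the name &lt;code&gt;BaseCollector&lt;/code&gt; comes from the project. The methods &lt;code&gt;validate_config&lt;/code&gt; and &lt;code&gt;collect&lt;/code&gt;, the &lt;code&gt;CollectedData&lt;/code&gt; container, and &lt;code&gt;InlineCollector&lt;/code&gt; are hypothetical, so check the repository for the real interface before writing a collector.&lt;/p&gt;

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class CollectedData:
    """Hypothetical container for what a collector hands to the pipeline."""

    source: str
    entries: list = field(default_factory=list)


class BaseCollector(ABC):
    """Assumed shape of the interface; the real method names may differ."""

    name = "base"

    @abstractmethod
    def validate_config(self, config):
        """Return True if the config holds what this collector needs."""

    @abstractmethod
    def collect(self, config):
        """Fetch data for the incident window and return CollectedData."""


class InlineCollector(BaseCollector):
    """Toy collector that reads log lines from the config, not from an API."""

    name = "inline"

    def validate_config(self, config):
        return "lines" in config

    def collect(self, config):
        if not self.validate_config(config):
            raise ValueError("config must contain 'lines'")
        return CollectedData(source=self.name, entries=list(config["lines"]))
```

&lt;p&gt;A Datadog or GCP collector would follow the same pattern, replacing the body of &lt;code&gt;collect&lt;/code&gt; with calls to that provider's client library.&lt;/p&gt;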

&lt;p&gt;&lt;strong&gt;Tech stack:&lt;/strong&gt; Python 3.10–3.13, Click, boto3, PyGithub, Rich, questionary, Pydantic v2, Anthropic + OpenAI SDKs.&lt;/p&gt;




&lt;h2&gt;Why I'm Looking for Contributors&lt;/h2&gt;

&lt;p&gt;AUTOPSY is live on PyPI with 149 passing tests and full CI/CD (GitHub Actions → PyPI via OIDC). The core diagnosis pipeline works. But there's a lot of surface area to cover, and I want to build this with the community, not in isolation.&lt;/p&gt;

&lt;p&gt;I've created &lt;strong&gt;17 open issues&lt;/strong&gt; across three difficulty levels:&lt;/p&gt;

&lt;h3&gt;🟢 Good First Issues (Great for First-Time Contributors)&lt;/h3&gt;

&lt;p&gt;These are scoped, well-documented, and perfect if you want to make your first open-source PR:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Add detailed &lt;code&gt;--version&lt;/code&gt; output&lt;/strong&gt; — show Python version, OS, prompt version&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add configurable log severity filter&lt;/strong&gt; — let users control which log levels get pulled&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add &lt;code&gt;CONTRIBUTING.md&lt;/code&gt;&lt;/strong&gt; — help future contributors get started&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add PR template&lt;/strong&gt; — standardize pull requests&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Improve expired AWS credential error messages&lt;/strong&gt; — better error UX&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;🟡 Medium Issues&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Datadog Logs collector&lt;/strong&gt; — many teams aren't on CloudWatch&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitLab collector&lt;/strong&gt; — not everyone uses GitHub&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;--demo&lt;/code&gt; mode&lt;/strong&gt; — let new users see AUTOPSY work without any credentials&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Diagnosis history (SQLite)&lt;/strong&gt; — persist past diagnoses locally&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slack notification&lt;/strong&gt; — post diagnosis results to an incident channel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallel collector execution&lt;/strong&gt; — speed up multi-log-group queries with asyncio&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;🔴 Advanced Issues&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;ELK / OpenSearch collector&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ollama support&lt;/strong&gt; — fully local LLM for teams that can't send logs to cloud providers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prompt evaluation harness&lt;/strong&gt; — automated accuracy testing against known incidents&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GCP Cloud Logging collector&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Auto-generated post-mortem documents&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every issue has clear acceptance criteria, implementation hints, and links to the relevant source files.&lt;/p&gt;




&lt;h2&gt;Getting Started&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone and install in dev mode&lt;/span&gt;
git clone https://github.com/zaappy/autopsy.git
&lt;span class="nb"&gt;cd &lt;/span&gt;autopsy
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;".[dev]"&lt;/span&gt;

&lt;span class="c"&gt;# Run tests&lt;/span&gt;
pytest

&lt;span class="c"&gt;# Run linting&lt;/span&gt;
ruff check &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The codebase is clean, strictly linted (ruff, 7 rule sets), type-checked (mypy strict mode), and every module has test coverage. You won't be guessing how things work.&lt;/p&gt;




&lt;h2&gt;The Bigger Picture&lt;/h2&gt;

&lt;p&gt;AUTOPSY targets the one phase of the incident lifecycle that nobody owns: &lt;strong&gt;diagnosis&lt;/strong&gt;. Detection is solved (Datadog, PagerDuty, Grafana). Response coordination is solved (Rootly, incident.io). But the moment between "alert fired" and "engineer understands why" — that's still manual grep and intuition at most companies.&lt;/p&gt;

&lt;p&gt;The funded players in this space (Ciroos at $21M, incident.io at $28M+) are all building expensive enterprise platforms. Nobody is building the simple, free tool that an individual engineer can install in 30 seconds. That's the gap.&lt;/p&gt;

&lt;p&gt;The CLI is, and always will be, free and open source. A paid team layer (AUTOPSY Cloud) is on the roadmap for teams that need persistent history, shared dashboards, and Slack integration.&lt;/p&gt;




&lt;h2&gt;Links&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/zaappy/autopsy" rel="noopener noreferrer"&gt;github.com/zaappy/autopsy&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PyPI:&lt;/strong&gt; &lt;code&gt;pip install autopsy-cli&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Issues:&lt;/strong&gt; &lt;a href="https://github.com/zaappy/autopsy/issues" rel="noopener noreferrer"&gt;github.com/zaappy/autopsy/issues&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you've ever been woken up at 3 AM by a production incident, you know the pain this solves. Come build it with me.&lt;/p&gt;

&lt;p&gt;Star the repo if this resonates, and grab an issue if you want to contribute. Every PR gets a proper review and every contributor gets credited.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built by &lt;a href="https://github.com/zeelapatel" rel="noopener noreferrer"&gt;Zeel&lt;/a&gt; with help from Claude Code.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>python</category>
      <category>devops</category>
      <category>hacktoberfest</category>
    </item>
  </channel>
</rss>
