I Built an Open-Source CLI That Diagnoses Production Incidents in 30 Seconds — Looking for Contributors

Zeel Patel — Mon, 09 Mar 2026 20:58:15 +0000

Every engineer who's been on call knows the drill.

It's 3 AM. PagerDuty goes off. You open your laptop, squint at CloudWatch, start grepping through thousands of log lines, flip over to GitHub to check if anyone deployed recently, then paste everything into an AI chat hoping it can make sense of the mess.

45 minutes later, you find it. Someone changed the Redis connection pool from 50 to 5.

I got tired of doing this manually, so I built AUTOPSY — an open-source Python CLI that does the entire investigation in one command.

pip install autopsy-cli
autopsy diagnose

It pulls the last 30 minutes of error logs from AWS CloudWatch, fetches recent commits and diffs from GitHub, sends everything to an AI model (Claude or GPT-4o), and prints a structured root cause analysis directly in your terminal.

The whole thing runs locally. No agents, no platform, no servers. Your logs go from AWS directly to the AI provider using your own credentials. Nothing touches our infrastructure.
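To make the CloudWatch step concrete, here is a rough sketch of how the 30-minute window and a Logs Insights query could be assembled. This is not AUTOPSY's actual code: the function name `build_start_query_params`, the filter pattern, the result limit, and the log group name are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical Logs Insights query; the pattern AUTOPSY really uses may differ.
ERROR_QUERY = (
    "fields @timestamp, @message "
    "| filter @message like /(?i)(error|exception|fatal)/ "
    "| sort @timestamp desc "
    "| limit 500"
)


def build_start_query_params(log_group, minutes=30, now=None):
    """Build keyword arguments for the CloudWatch Logs start_query call."""
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    return {
        "logGroupName": log_group,
        "startTime": int(start.timestamp()),  # epoch seconds
        "endTime": int(end.timestamp()),
        "queryString": ERROR_QUERY,
    }


params = build_start_query_params(
    "/aws/lambda/checkout",  # hypothetical log group
    now=datetime(2026, 3, 9, 3, 0, tzinfo=timezone.utc),
)
print(params["endTime"] - params["startTime"])  # 1800 seconds = 30 minutes
```

With boto3 installed and credentials configured, these parameters would be passed as `boto3.client("logs").start_query(**params)`, and the returned `queryId` polled with `get_query_results` until the query completes.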

How It Works Under the Hood

The architecture is a modular pipeline:

CLI (Click)
  └── DiagnosisOrchestrator
        ├── CloudWatchCollector  →  AWS Logs Insights (boto3)
        ├── GitHubCollector      →  Commits + diffs (PyGitHub)
        └── AIEngine             →  Anthropic / OpenAI
                └── TerminalRenderer (Rich)

A four-stage log reduction pipeline compresses raw CloudWatch output to fit LLM context windows — regex filtering, SHA256 deduplication, truncation, and a hard 6,000-token budget. The AI response is validated against a Pydantic schema, with automatic retry on malformed output.
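The four reduction stages can be sketched in a few lines of standard-library Python. Treat this as an illustration of the idea rather than the code in the repo: the regex, the per-line cap of 500 characters, the 4-characters-per-token estimate, and the function name `reduce_logs` are assumptions. Only the four stages and the 6,000-token budget come from the description above.

```python
import hashlib
import re

ERROR_PATTERN = re.compile(r"error|exception|traceback|fatal", re.IGNORECASE)
MAX_LINE_CHARS = 500   # assumed per-line truncation limit
TOKEN_BUDGET = 6000    # hard budget mentioned in the article
CHARS_PER_TOKEN = 4    # rough heuristic, not a real tokenizer


def reduce_logs(lines, token_budget=TOKEN_BUDGET):
    """Shrink raw log lines so they fit inside an LLM context window."""
    # Stage 1: regex filtering keeps only lines that look like errors.
    relevant = [ln for ln in lines if ERROR_PATTERN.search(ln)]

    # Stage 2: SHA256 deduplication keeps the first copy and counts repeats.
    counts = {}
    firsts = {}
    for ln in relevant:
        digest = hashlib.sha256(ln.encode("utf-8")).hexdigest()
        counts[digest] = counts.get(digest, 0) + 1
        firsts.setdefault(digest, ln)

    reduced = []
    used_tokens = 0
    for digest, ln in firsts.items():
        # Stage 3: truncation caps the length of each surviving line.
        entry = f"[x{counts[digest]}] {ln[:MAX_LINE_CHARS]}"
        # Stage 4: stop as soon as the estimated token budget would overflow.
        cost = len(entry) // CHARS_PER_TOKEN + 1
        if used_tokens + cost > token_budget:
            break
        reduced.append(entry)
        used_tokens += cost
    return reduced


sample = [
    "INFO request ok",
    "ERROR redis: connection pool exhausted",
    "ERROR redis: connection pool exhausted",
    "ERROR timeout talking to upstream",
]
reduced = reduce_logs(sample)
for line in reduced:
    print(line)
# [x2] ERROR redis: connection pool exhausted
# [x1] ERROR timeout talking to upstream
```

Real log lines usually carry timestamps and request IDs, so a production version would likely normalize those away before hashing. Otherwise near-identical lines never deduplicate.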

Every collector implements a BaseCollector interface, so adding a new data source (Datadog, ELK, GCP) means writing a single new class.
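For a feel of what "a single new class" means, here is a minimal sketch of the pattern. Only the name `BaseCollector` comes from the project. The method names (`validate_config`, `collect`), the `CollectedData` container, and the `DatadogCollector` stub are hypothetical, so check the repo for the real interface before building on this.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class CollectedData:
    """Hypothetical container for whatever a collector returns."""
    source: str
    entries: list = field(default_factory=list)


class BaseCollector(ABC):
    """Assumed shape of the interface; the real method names may differ."""
    name: str

    @abstractmethod
    def validate_config(self, config: dict) -> bool:
        """Return True if the config holds everything this source needs."""

    @abstractmethod
    def collect(self, config: dict) -> CollectedData:
        """Fetch data from the source and return it in a common shape."""


class DatadogCollector(BaseCollector):
    """Stub showing where a new data source would plug in."""
    name = "datadog"

    def validate_config(self, config: dict) -> bool:
        return bool(config.get("api_key"))

    def collect(self, config: dict) -> CollectedData:
        if not self.validate_config(config):
            raise ValueError("datadog: api_key is required")
        # A real implementation would call the Datadog Logs API here.
        return CollectedData(source=self.name, entries=[])


collector = DatadogCollector()
result = collector.collect({"api_key": "dummy"})
print(result.source)  # datadog
```

Because the orchestrator only depends on the abstract interface, it can loop over any mix of collectors without knowing which backend each one talks to.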

Tech stack: Python 3.10–3.13, Click, boto3, PyGitHub, Rich, questionary, Pydantic v2, Anthropic + OpenAI SDKs.

Why I'm Looking for Contributors

AUTOPSY is live on PyPI with 149 passing tests and full CI/CD (GitHub Actions → PyPI via OIDC). The core diagnosis pipeline works. But there's a lot of surface area to cover, and I want to build this with the community, not in isolation.

I've created 17 open issues across three difficulty levels:

🟢 Good First Issues (Great for First-Time Contributors)

These are scoped, well-documented, and perfect if you want to make your first open-source PR:

Add detailed --version output — show Python version, OS, prompt version
Add configurable log severity filter — let users control which log levels get pulled
Add CONTRIBUTING.md — help future contributors get started
Add PR template — standardize pull requests
Improve expired AWS credential error messages — better error UX

🟡 Medium Issues

Datadog Logs collector — many teams aren't on CloudWatch
GitLab collector — not everyone uses GitHub
--demo mode — let new users see AUTOPSY work without any credentials
Diagnosis history (SQLite) — persist past diagnoses locally
Slack notification — post diagnosis results to an incident channel
Parallel collector execution — speed up multi-log-group queries with asyncio
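
For anyone eyeing the parallel execution issue, one possible starting point is to keep the blocking boto3 calls as they are and fan them out with `asyncio.to_thread`. This is a sketch of one approach, not the planned design: `query_log_group` is a stand-in that sleeps instead of calling AWS, and the log group names are made up.

```python
import asyncio
import time


def query_log_group(log_group):
    """Stand-in for a blocking CloudWatch query (hypothetical)."""
    time.sleep(0.1)  # simulates network latency
    return f"{log_group}: 0 errors"


async def collect_all(log_groups):
    # asyncio.to_thread runs each blocking call in a worker thread,
    # so the queries overlap instead of running back to back.
    tasks = [asyncio.to_thread(query_log_group, g) for g in log_groups]
    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*tasks)


groups = ["/app/api", "/app/worker", "/app/cron"]  # hypothetical names
results = asyncio.run(collect_all(groups))
print(results)
```

`asyncio.to_thread` is available from Python 3.9, so it fits the project's 3.10+ baseline. Wrapping the synchronous client this way avoids adding an async AWS SDK as a dependency.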

🔴 Advanced Issues

ELK / OpenSearch collector
Ollama support — fully local LLM for teams that can't send logs to cloud providers
Prompt evaluation harness — automated accuracy testing against known incidents
GCP Cloud Logging collector
Auto-generated post-mortem documents

Every issue has clear acceptance criteria, implementation hints, and links to the relevant source files.

Getting Started

# Clone and install in dev mode
git clone https://github.com/zaappy/autopsy.git
cd autopsy
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
ruff check .

The codebase is clean, strictly linted (ruff, 7 rule sets), type-checked (mypy strict mode), and every module has test coverage. You won't be guessing how things work.

The Bigger Picture

AUTOPSY targets the one phase of the incident lifecycle that nobody owns: diagnosis. Detection is solved (Datadog, PagerDuty, Grafana). Response coordination is solved (Rootly, incident.io). But the moment between "alert fired" and "engineer understands why" — that's still manual grep and intuition at most companies.

The funded players in this space (Ciroos at $21M, incident.io at $28M+) are all building expensive enterprise platforms. Nobody is building the simple, free tool that an individual engineer can install in 30 seconds. That's the gap.

The CLI is, and always will be, free and open source. A paid team layer (AUTOPSY Cloud) is on the roadmap for teams that need persistent history, shared dashboards, and Slack integration.

Forem: Zeel Patel