Forem: Praveen

Why AI provenance tools fail when their layers disagree

Praveen — Sun, 24 May 2026 05:05:12 +0000

Most people think the hard part of an AI provenance tool is capturing the prompt or parsing the model output. That is only the first layer of the problem. The more serious failure appears after the system has multiple moving parts: an editor extension, a backend, and an assistant-facing API all trying to describe the same event.

That is where trust starts to break.

A provenance system is supposed to answer a simple question: what happened to this change, and how did it get here? But once the extension, backend, and MCP server all participate in that answer, any mismatch in response shape, error handling, or mode-specific behavior becomes user-visible. A redirect that is helpful during setup can become opaque during login. A workspace response that is technically correct can still be formatted incorrectly for the MCP layer. A Lite-only feature gate can look like an authentication failure if the error mapping is too generic. None of those are parsing bugs. They are consistency bugs.
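To make the feature-gate case concrete, here is a minimal sketch of mode-aware error mapping. It is not the LineageLens implementation: the `code` field, its `feature_unavailable` value, and the `UserFacingError` type are assumptions made up for illustration. The point is only that a 403 is checked for a feature-gate marker before it falls through to a generic access message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserFacingError:
    kind: str      # "auth", "feature_gate", or "unknown"
    message: str


def map_backend_error(status: int, body: dict) -> UserFacingError:
    """Translate a backend error response into what the user is shown."""
    if status == 401:
        return UserFacingError("auth", "Login required. Please sign in again.")
    # Assumption: the backend tags feature-gate refusals with a "code" field.
    if status == 403 and body.get("code") == "feature_unavailable":
        # Pass the backend's own message through instead of an auth template.
        fallback = "This feature is not available in the current mode."
        return UserFacingError("feature_gate", body.get("message", fallback))
    if status == 403:
        return UserFacingError("auth", "You do not have access to this resource.")
    return UserFacingError("unknown", body.get("message", f"Unexpected error (HTTP {status})."))
```

With this ordering, a gated request and an expired session produce different `kind` values, so each client can render them differently instead of guessing.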

This is why contract drift matters so much in AI infrastructure tools. The system is not just moving data. It is narrating reality across surfaces. If one surface says “setup needed,” another says “login failed,” and a third says “feature unavailable,” the user no longer knows which layer to believe.

In LineageLens, the recent fixes were all about reducing that kind of ambiguity. Fresh installs now get a real auth response instead of an opaque redirect. The MCP server matches the workspace response shape that the backend actually returns. Lite-mode 403s surface the backend’s upgrade message instead of a misleading auth template. Ingest warnings and duplicate storage status now reach the user instead of disappearing silently. Even the token lifecycle became more robust by supporting refresh before falling back to password re-login.
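The token lifecycle change can be sketched in a few lines. This is a simplified illustration, not the project's code: `try_refresh` and `password_login` are hypothetical callables standing in for whatever the client actually uses to talk to the backend.

```python
from typing import Callable, Optional, Tuple


def obtain_token(
    try_refresh: Callable[[], Optional[str]],
    password_login: Callable[[], str],
) -> Tuple[str, str]:
    """Return (token, source), preferring a refresh over a password login."""
    # Assumption: try_refresh returns None when the refresh token is
    # missing, expired, or rejected by the backend.
    token = try_refresh()
    if token is not None:
        return token, "refresh"
    # Only fall back to the more disruptive path when refresh fails.
    return password_login(), "password"
```

Returning the source alongside the token lets the caller log which path was taken, which helps when a user reports being asked to log in again unexpectedly.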

That is the practical lesson: once a product spans multiple clients, you need contract discipline at the boundaries. Not just tests for the core logic, but tests for the truth that each layer tells the next one.
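One lightweight way to test a boundary is to write the expected response shape down once and check payloads from each layer against it. The sketch below assumes a hypothetical workspace payload with `id`, `name`, and `mode` fields. The real LineageLens response shape is not shown in this post, so treat the field names as placeholders.

```python
# Hypothetical contract: field name -> expected Python type.
WORKSPACE_CONTRACT = {"id": str, "name": str, "mode": str}


def contract_violations(payload: dict, contract: dict) -> list:
    """List every way a payload deviates from the agreed shape."""
    problems = []
    for field, expected in contract.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            actual = type(payload[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    # Extra fields are reported too, since they often signal a wrapper
    # object that one layer added and another does not expect.
    for field in sorted(payload.keys() - contract.keys()):
        problems.append(f"unexpected field: {field}")
    return problems
```

Running the same check in the backend tests and the MCP server tests means a drift such as wrapping the payload in an extra `workspace` key fails in both places at once.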

For AI provenance tools, that truth is the product. If the extension, backend, and MCP server disagree, the audit trail becomes noisy instead of useful. And if the audit trail is noisy, the whole category loses value.

The fix is not glamorous. It is boundary work: stable payloads, mode-aware errors, better token handling, and fewer assumptions about what another layer “probably meant.” But that is exactly the kind of work that makes a provenance system trustworthy.

LineageLens: A "Git Blame" for AI-Generated Code

Praveen — Sat, 23 May 2026 05:58:45 +0000

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

When an engineer uses an AI agent in the terminal to write or refactor code, git records only the engineer as the author. The context (which prompt generated the code, which model was used, and how many iterations it took) is lost the moment the terminal session ends.

I built LineageLens to fix this. It is an open-source, self-hosted proxy (running on port 8788) that intercepts AI dev tool traffic. It parses the native AI tool calls and logs the exact prompt, model, and applied edit to a local database, creating a searchable audit trail and dashboard for AI-generated code.

Demo

GitHub Repository: karnati-praveen/lineagelens
VS Code Extension: LineageLens Marketplace

The Comeback Story

The initial prototype of LineageLens was a massive headache. It relied heavily on brittle regex text-scraping to pull code blocks out of standard LLM markdown responses. It broke constantly, and the project stalled because it couldn't tell if a developer actually accepted the AI's suggestion or rejected it.

For this challenge, I ripped out the regex engine and started over. I built native protocol adapters that parse Anthropic’s tool_use blocks and OpenAI’s apply_patch DSL directly from the API streams. More importantly, I introduced a state machine that correlates an AI's proposed edit with the next turn's tool_result to track whether the code was applied, rejected, or errored. That turned the project from a noisy text logger into a much more accurate governance tool.
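The correlation idea can be sketched as a small state machine keyed by the tool call id. This is an illustration rather than the LineageLens code. It relies on the Anthropic message format, where a `tool_use` block carries an `id` and the following `tool_result` block carries a matching `tool_use_id` and an optional `is_error` flag. How a client signals a user rejection varies by tool, so the `is_rejection` predicate is a hypothetical hook the caller must supply.

```python
from enum import Enum


class EditState(Enum):
    PROPOSED = "proposed"
    APPLIED = "applied"
    REJECTED = "rejected"
    ERRORED = "errored"


class EditTracker:
    def __init__(self, is_rejection=lambda block: False):
        self._states = {}
        # Hypothetical hook: decides whether a tool_result is a user rejection.
        self._is_rejection = is_rejection

    def on_assistant_block(self, block: dict) -> None:
        # A tool_use block from the model opens a new proposed edit.
        if block.get("type") == "tool_use":
            self._states[block["id"]] = EditState.PROPOSED

    def on_user_block(self, block: dict) -> None:
        if block.get("type") != "tool_result":
            return
        tool_id = block.get("tool_use_id")
        if self._states.get(tool_id) is not EditState.PROPOSED:
            return  # unknown id or already resolved: ignore
        if self._is_rejection(block):
            self._states[tool_id] = EditState.REJECTED
        elif block.get("is_error"):
            self._states[tool_id] = EditState.ERRORED
        else:
            self._states[tool_id] = EditState.APPLIED

    def state_of(self, tool_id: str):
        return self._states.get(tool_id)
```

An edit that never receives a `tool_result` stays in `PROPOSED`, which is itself useful information: the session ended before the outcome was known.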

My Experience with GitHub Copilot

Rebuilding the core engine required handling complex, fragmented Server-Sent Events (SSE) and assembling streaming JSON payloads for the tool calls. GitHub Copilot was instrumental in accelerating this refactor. It helped quickly scaffold the FastAPI endpoints, write the tedious string-parsing logic for the proxy stream interception, and auto-complete the SQLAlchemy models needed for the new state-machine database architecture. It turned weeks of manual API debugging into just a few days of rapid implementation.
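To show why the streaming part is fiddly, here is a minimal sketch of reassembling tool call arguments from already-decoded SSE events. It assumes the Anthropic streaming event shapes (`content_block_start`, `content_block_delta` with an `input_json_delta`, and `content_block_stop`) and takes a list of event dicts as input. The SSE line parsing itself and the OpenAI adapter are left out, and the function name is made up for this example.

```python
import json


def assemble_tool_calls(events):
    """Rebuild complete tool calls from fragmented streaming events."""
    pending = {}    # content block index -> partially assembled call
    finished = []
    for event in events:
        kind = event.get("type")
        index = event.get("index")
        if kind == "content_block_start":
            block = event["content_block"]
            if block.get("type") == "tool_use":
                pending[index] = {"id": block["id"], "name": block["name"], "chunks": []}
        elif kind == "content_block_delta" and index in pending:
            delta = event["delta"]
            if delta.get("type") == "input_json_delta":
                # Fragments can split anywhere, even mid-string, so they are
                # only buffered here and never parsed individually.
                pending[index]["chunks"].append(delta["partial_json"])
        elif kind == "content_block_stop" and index in pending:
            call = pending.pop(index)
            raw = "".join(call["chunks"])
            finished.append({
                "id": call["id"],
                "name": call["name"],
                "input": json.loads(raw) if raw else {},
            })
    return finished
```

Parsing only at `content_block_stop` is the key design choice: any attempt to decode a single fragment will fail as soon as the provider splits a JSON string across two events.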

The "Ghost in the Repo": Why AI Agents break Git Blame

Praveen — Fri, 22 May 2026 04:48:23 +0000

For the last 15 years, git blame has been the ultimate source of truth for software engineering. If a production bug surfaces, or a security auditor asks why a specific database query was written a certain way, git blame tells you exactly who to ask.

With the rapid adoption of agentic CLI tools like Claude Code, OpenAI Codex, and Aider, that source of truth is silently breaking.

The Context Collapse

When you use an AI agent to write code, the workflow looks like this:

1. You open the terminal and type: "Add a JWT verification middleware, skip checking the expiration for now."
2. The AI uses a tool (like Anthropic's tool_use or OpenAI's apply_patch) to edit auth.py.
3. You review the diff in your terminal, hit 'y' to accept, and commit the code.

Here is the problem: Git only records Step 3.

The most critical context evaporates the moment you close the terminal: the intent ("skip checking the expiration"), the model used (claude-3-5-sonnet), and the fact that an AI generated the code. We are filling our repositories with "Ghosts": code that looks like it was written by a human but carries no record of the intent behind it.

Why this is a Security Nightmare

If you are a solo developer, this is just annoying. If you are an engineering manager or a CISO, this is a massive compliance blind spot.

When a vulnerability scanner flags that JWT middleware three months from now, the reviewing engineer will see your name on the commit. They will assume you had a specific, undocumented business reason for skipping the expiration check. They won't know it was a temporary shortcut requested in a prompt and written by an AI model.

Fixing it at the Proxy Layer

To solve this for my own workflows, I realized that scraping text or using git hooks wouldn't work. By the time code hits git, it's too late. The provenance is gone.

I recently open-sourced LineageLens, a self-hosted intercepting proxy designed specifically for AI agents. Instead of looking at git, it sits between your terminal and the AI provider.

Because it intercepts the raw API traffic, it can parse the actual structured tool calls. It builds a state machine to track when an AI proposes an edit, and correlates it with the subsequent tool_result to confirm if the developer actually applied it.

The result is a local, searchable audit trail that answers: "Which code in our repo was AI-generated, by which model, with what exact prompt?" If you are interested in how the proxy parses these agentic protocols, or if you want to run the single-container SQLite version to track your own AI usage this weekend, the repo is live here: LineageLens on GitHub.
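As a rough picture of what querying such an audit trail could look like, here is a sketch using Python's built-in sqlite3 module. The `edits` table, its columns, and the sample rows are assumptions invented for this example and are not the actual LineageLens schema.

```python
import sqlite3

# Hypothetical schema for illustration only.
SCHEMA = """
CREATE TABLE edits (
    id INTEGER PRIMARY KEY,
    file_path TEXT NOT NULL,
    model TEXT NOT NULL,
    prompt TEXT NOT NULL,
    state TEXT NOT NULL
)
"""


def applied_ai_edits(conn, file_path):
    """Return (model, prompt) for every applied AI edit to a file, oldest first."""
    cursor = conn.execute(
        "SELECT model, prompt FROM edits "
        "WHERE file_path = ? AND state = 'applied' ORDER BY id",
        (file_path,),
    )
    return cursor.fetchall()


def demo_connection():
    """Build an in-memory database with two made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO edits (file_path, model, prompt, state) VALUES (?, ?, ?, ?)",
        [
            ("auth.py", "claude-3-5-sonnet",
             "Add a JWT verification middleware, skip checking the expiration for now.",
             "applied"),
            ("auth.py", "claude-3-5-sonnet", "Rename the middleware class.", "rejected"),
        ],
    )
    return conn
```

Calling `applied_ai_edits(demo_connection(), "auth.py")` returns only the applied edit, which is the row a reviewer would want to see next to the commit three months later.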

Are you currently tracking AI provenance in your repos, or are you flying blind? Let me know in the comments.